RD-A152  124 
UNCLASSIFIED 


AN ANALYSIS OF DATA DICTIONARIES AND THEIR ROLE IN
INFORMATION RESOURCE MANAGEMENT (U) NAVAL POSTGRADUATE
SCHOOL MONTEREY CA S L LANDIN ET AL. SEP 84

F/G  5/2 




AD-A152  134 


NAVAL  POSTGRADUATE  SCHOOL 

Monterey,  California 


DTIC ELECTE
APR 5 1985

Approved for public release; distribution unlimited


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE
READ INSTRUCTIONS BEFORE COMPLETING FORM

1.   REPORT NUMBER
2.   GOVT ACCESSION NO.: [illegible]
3.   RECIPIENT'S CATALOG NUMBER
4.   TITLE (and Subtitle): An Analysis of Data Dictionaries and
     Their Role in Information Resource Management
5.   TYPE OF REPORT & PERIOD COVERED
6.   PERFORMING ORG. REPORT NUMBER
7.   AUTHOR(S): Suzanne L. Landin, Ronald L. Owens
8.   CONTRACT OR GRANT NUMBER(S)
9.   PERFORMING ORGANIZATION NAME AND ADDRESS:
     Naval Postgraduate School, Monterey, CA 93943
10.  PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
11.  CONTROLLING OFFICE NAME AND ADDRESS:
     Naval Postgraduate School, Monterey, CA 93943
12.  REPORT DATE: September 1984
13.  NUMBER OF PAGES: 108
14.  MONITORING AGENCY NAME & ADDRESS (if different from
     Controlling Office)
15.  SECURITY CLASS. (of this report): Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE
16.  DISTRIBUTION STATEMENT (of this Report):
     Approved for public release; distribution unlimited
17.  DISTRIBUTION STATEMENT (of the abstract entered in Block 20,
     if different from Report)
18.  SUPPLEMENTARY NOTES
19.  KEY WORDS (Continue on reverse side if necessary and identify
     by block number):
     Data Dictionary; Distributed Data Processing; Information
     Resource Management; Federal Information Processing Standard
     (FIPS) for Data Dictionary Systems
20.  ABSTRACT (Continue on reverse side if necessary and identify
     by block number):
     The goal of efficient management of an organization's
     information resource can be accomplished through the
     implementation and use of a data dictionary. This thesis
     defines the structure and functions of a data dictionary and
     analyzes the attempt of the National Bureau of Standards to
     promulgate a standard software specification for use in the
     evaluation and selection of data dictionaries in the federal
     government. Criteria for the "ideal" data dictionary are
     developed based on the role a dictionary can play in
     information resource management and are then used to evaluate
     four commercial data dictionary packages. Finally, some ideas
     concerning possible applications for data dictionary
     technology are presented.

DD FORM 1473 (1 JAN 73). EDITION OF 1 NOV 65 IS OBSOLETE


Approved for public release; distribution unlimited.

An  Analysis  of  Data  Dictionaries 
and  Their  Role  in 
Information  Resource  Management 


Suzanne L. Landin
Lieutenant, United States Navy
B.A., University of Washington, 1973


Ronald L. Owens

Lieutenant Commander, United States Navy
B.B.A., Hardin-Simmons University, 1971


Submitted  in  partial  fulfillment  of  the 
requirements for the degree of

MASTER OF SCIENCE IN INFORMATION SYSTEMS

from  the 

NAVAL  POSTGRADUATE  SCHOOL 
September  1984 


Authors:

    Suzanne L. Landin

    Ronald L. Owens


Approved by:

    Daniel R. Dolk, Thesis Advisor

    [name illegible], Second Reader

    Willis R. Greer, Jr., Chairman,
    Department of Administrative Sciences

    Kneale T. Marshall,
    Dean of Information and Policy Sciences


ABSTRACT 


The goal of efficient management of an organization's
information resource can be accomplished through the
implementation and use of a data dictionary. This thesis defines
the structure and functions of a data dictionary and
analyzes the attempt of the National Bureau of Standards to
promulgate a standard software specification for use in the
evaluation and selection of data dictionaries in the federal
government. Criteria for the "ideal" data dictionary are
developed based on the role a dictionary can play in
information resource management and are then used to evaluate
four commercial data dictionary packages. Finally, some
ideas concerning possible applications for data dictionary
technology are presented.




TABLE OF CONTENTS


I.   THE NEED FOR A DATA DICTIONARY
     A. BACKGROUND
     B. PURPOSE OF THE THESIS

II.  THE ANATOMY OF A DATA DICTIONARY
     A. INTRODUCTION
     B. THE STRUCTURE OF A DATA DICTIONARY
     C. THE FUNCTIONS OF A DATA DICTIONARY
        1. Definition
        2. Update
        3. Retrieval
        4. Software Interface

III. FEDERAL INFORMATION PROCESSING STANDARD FOR
     DATA DICTIONARY SYSTEMS
     A. INTRODUCTION
     B. SYSTEM STANDARD SCHEMA
     C. COMMAND LANGUAGE INTERFACE SPECIFICATIONS
     D. INTERACTIVE INTERFACE SPECIFICATIONS
     E. DICTIONARY ADMINISTRATOR SUPPORT SPECIFICATIONS
     F. EVALUATION

IV.  THE ROLE OF THE DATA DICTIONARY IN INFORMATION
     RESOURCE MANAGEMENT
     A. INFORMATION RESOURCE MANAGEMENT
        1. Planning Phase
        2. Study Phase
        3. Design/Coding Phase
        4. Operation and Maintenance
     B. OBJECTIVES OF A DATA DICTIONARY
        1. Data Security
        2. Data Integrity
        3. Documentation/Maintenance
     C. THE IDEAL DATA DICTIONARY
        1. System Standard Schema and Extensibility
        2. Command and Query Languages
        3. Ease of Use
        4. Security
        5. Documentation and Reports
        6. Application Interfaces

V.   EVALUATION OF COMMERCIAL DATA DICTIONARIES
     A. DATA DESIGNER
     B. MSP DATAMANAGER
     C. ADR DATADICTIONARY
     D. ORACLE
     E. COMPARISON OF DATA DESIGNER, DATAMANAGER,
        DATADICTIONARY, AND ORACLE

VI.  EXPANSIONS OF THE ROLE OF DATA DICTIONARIES
     A. DISTRIBUTED DATA PROCESSING
     B. DECISION-MAKING
        1. The Decision-Making Process
        2. Crisis Management
     C. CONCLUSIONS

APPENDIX A: BACKUS-NAUR FORM

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST




LIST OF TABLES


1.  Standard Commands of DATA DESIGNER
2.  DATA DESIGNER Modeling Codes
3.  DATA DESIGNER Generate Options
4.  Reports Available with DATA DESIGNER
5.  DATAMANAGER Standard Commands
6.  DATAMANAGER Standard Schema Descriptors
7.  DATAMANAGER Maintenance Commands
8.  DATAMANAGER Report/Query Commands
9.  Components of ADR's DATACOM System
10. ADR DATADICTIONARY Standard Entity-types
11. Principal Reports of DATADICTIONARY
12. Category One: Schemas and Extensibility
13. Category Two: Command/Query Languages
14. Category Three: Relative Ease of Use
15. Category Four: Security
16. Category Five: Documentation and Reports
17. Category Six: Application Interfaces
18. Data Dictionary Comparison Totals



LIST OF FIGURES


2.1  Types of Data Dictionary Metadata
2.2  Free-standing and Dependent Data Dictionaries
2.3  Views Within a DBMS
2.4  Sample Schema Descriptors
2.5  Comparison of Data Levels
5.1  DATAMANAGER's Hierarchy of Entity-types
5.2  STUDENT Example in DATAMANAGER Structure
5.3  A Logical Hierarchy of Entity-types
5.4  ADR DATADICTIONARY Master Menu
5.5  ORACLE's Logical Hierarchy
5.6  Tables of the ORACLE Data Dictionary
5.7  ORACLE CATALOG Listing
5.8  ORACLE SYSCATALOG Listing
5.9  ORACLE CATALOG Listing With New View
5.10 ORACLE SYSTABAUTH Listing for User Owens
6.1  Duplicated Data Dictionaries
6.2  Partitioned Data Dictionary (DD)
6.3  Hierarchy of Distributed Data Dictionaries




I. THE NEED FOR A DATA DICTIONARY

A.  BACKGROUND 

One of the most important resources of an organization
and one that is too often overlooked is data. People,
dollars, materials, and time are usually well controlled and
budgeted, yet the data about an organization and its
operations is often managed haphazardly, if at all.

Database technology has made possible the storage and
processing of an organization's data as an integrated whole
and allows the sharing of that processed data, or information,
throughout the organization. A database management
system (DBMS) acts as a librarian for the database, storing
and retrieving data according to a particular format
[Ref. 1]. However, a DBMS does not necessarily provide for
the security, integrity, accountability, or maintainability
of data. These objectives are best achieved when a data
dictionary is used in conjunction with the DBMS.

Simply stated, a data dictionary is a central repository
of descriptive data about the definition, characteristics,
location, and usage of the data found in an organization. A
fully utilized data dictionary will control the collection,
maintenance, and retrieval of this data. For example, if
the aircraft carrier U.S.S. Constellation had a data
dictionary, it would be possible to ask questions such as:

What type of data is contained in a "Controlled
Equipage" record?

How many programs use the "Personnel" file?

Which departments receive the "Ammunition Transaction"
report?

What is the relationship between "Inventory Item" and
"Reorder Point"?

In which records is the field "Social Security Number"
found?

Who is authorized to update the "Readiness Status"
field?

What is the range of values for "Readiness Status" data?

In which database is the "Preventive Maintenance" file
found?

Those who will benefit from the answers to these questions
include not only the ship's data administrator, but also
programmers, systems development personnel, data processing
staff, auditors, and, most important, end users at every
level of the organization.

Even though data dictionary software has been available
commercially since 1970 and the advantages and benefits
associated with data dictionaries are widely recognized,
most organizations have been slow to implement them, and the
Department of Defense is no exception. A recent study by
the Committee on Review of Navy Long-Range Automatic Data
Processing Planning [Ref. 2] points out that

Virtually every action by a commander, manager, or
administrator in the Navy, as in any large organization,
involves the acquisition and understanding of information:
information about the organization, about its
status, about its resources, about its environment. His
actions usually result in the creation and promulgation
of policies and directives: that is, information for
subordinates, peers, or superiors.

If it is true that "the benefit derived from a dictionary is
proportional to the size of the dictionary itself," [Ref. 3]
the military stands to gain a great deal from the
implementation of data dictionaries.

At present, there is no consensus in computing literature
about exactly what a data dictionary should do or what
kind of data dictionary is best for a particular organization.
There are many different data dictionary packages on
the market from which to choose; most of these have similar
features. Therefore, the potential purchaser of a data
dictionary is in need of guidance when making the choice.
The United States Government has recognized this problem and
has identified standards for data dictionaries in Federal
Information Processing Standards promulgated by the National
Bureau of Standards. An understanding of these standards
and of the functions and objectives of a data dictionary
will provide the reader with a basis on which to evaluate
data dictionary packages and to use them effectively.

B. PURPOSE OF THE THESIS

We believe that it is important for managers in the
military to understand what a data dictionary is and what it
can do to help an organization manage its data. Thus, the
purpose of this thesis is to provide the reader with an
understanding of the structure and functions of a data
dictionary, guidelines for the evaluation and selection of a
data dictionary, and an analysis of several commercial data
dictionary products. We will show the reader how the
management of an organization's data resource can be
accomplished by means of a data dictionary and will recommend
ways for the role of the data dictionary to be expanded.


II. THE ANATOMY OF A DATA DICTIONARY


A.  INTRODUCTION 

Because data dictionary technology is a new and continually
evolving field, it suffers from a lack of consistency
in its terminology. The many texts and articles on the
subject and the various commercial data dictionary products
use a wide variety of differing terms. The data dictionary
itself is known as a data dictionary/directory, a data
dictionary system, or an information resource management
dictionary. In order to provide a base of reference for the
remainder of this thesis, we will present our own set of
definitions distilled from our references.

Data dictionaries run the gamut from manual, on-paper
systems to highly sophisticated software and can be used
both in database and non-database environments. We will
discuss automated data dictionaries only as they relate to a
database, where they have the most to offer the potential
user.

In order to assess the benefits of a data dictionary, it
is necessary to understand how a data dictionary is organized
and what its capabilities are. A data dictionary does
not contain the actual data that constitutes an organization's
database; instead, it is itself a database called a
metadatabase that contains metadata, or data about the
database data. Two types of metadata are found in a data
dictionary. Dictionary metadata tells what data exists, the
origins of the data, the attributes the data may have, how
and by whom the data may be used, what the structure of the
data is, and what the relationships between the data are.
Directory metadata tells where the data is located, how it
can be accessed, and what its physical representation within
the computer is. Together, these two types of metadata
provide the means for accessing and controlling the data in
the database. Figure 2.1 illustrates this division of
metadata.


                   DATA DICTIONARY
                          |
          +---------------+---------------+
          |                               |
   DICTIONARY METADATA           DIRECTORY METADATA
   - Data Present                - Data Location
   - Data Origin                 - Access Modes
   - Attributes                  - Physical Representation
   - Security/Access
   - Data Structure
   - Relationships

Figure 2.1 Types of Data Dictionary Metadata
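The split between the two kinds of metadata can be pictured as two small record types held together in one dictionary entry. The following Python sketch is only an illustration: the class names, field names, and the sample values for "Readiness Status" are assumptions made here and do not come from any product discussed in this thesis.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryMetadata:
    """What the data is: meaning, origin, attributes, and who may use it."""
    name: str
    origin: str
    attributes: dict[str, str]
    authorized_users: list[str]
    related_to: list[str] = field(default_factory=list)


@dataclass
class DirectoryMetadata:
    """Where the data is, how it is reached, and how it is physically stored."""
    location: str
    access_mode: str
    physical_format: str


@dataclass
class MetadataEntry:
    """One entry in the metadatabase; it holds no actual database values."""
    dictionary: DictionaryMetadata
    directory: DirectoryMetadata


# All values below are hypothetical, chosen only to fill in the structure.
entry = MetadataEntry(
    dictionary=DictionaryMetadata(
        name="Readiness Status",
        origin="Operations department",
        attributes={"type": "code", "length": "2"},
        authorized_users=["operations_officer"],
    ),
    directory=DirectoryMetadata(
        location="READINESS database, STATUS file",
        access_mode="indexed",
        physical_format="2-byte character field",
    ),
)


def describe(e: MetadataEntry) -> str:
    """Combine both kinds of metadata to say what the data is and where it lives."""
    return f"{e.dictionary.name} is stored in {e.directory.location}"
```

Note that nothing in `entry` is a value of the Readiness Status field itself; the entry only describes that field and tells a program where to find it.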

Data dictionaries fall into two categories--free-standing
and DBMS-dependent. Figure 2.2 shows a partial
listing of some commercial data dictionary packages
according to type. A free-standing data dictionary (also
called independent or stand-alone) is not tied to any
particular database management system (DBMS). It manages
data by utilizing software routines built into the data
dictionary package and thus is not dependent on DBMS
software. This independence provides flexibility: a
free-standing data dictionary can have the capability to support
more than one type of DBMS. However, this flexibility is
gained at the cost of duplication of data descriptions in
the database and the data dictionary.


Free-standing Data Dictionaries

DATA CATALOGUE 2 (1974)
   - Synergetics Corporation
DATA DESIGNER (1975)
   - Database Design, Inc.
PRIDE-LOGIK (1974)
   - M. Bryce & Associates, Inc.
DATAMANAGER (1975)
   - Management Systems & Programming, LTD

DBMS-Dependent Data Dictionaries

ADABAS (1978)
   - Software AG of North America, Inc.
DATADICTIONARY/DATACOM (1979)
   - Applied Data Research (ADR)
ORACLE (1983)
   - Relational Software, Inc.
DB/DC DATA DICTIONARY (1974)
   - International Business Machines
EDICT (1976)
   - Infodata Systems, Inc.


Figure 2.2 Free-standing and Dependent Data Dictionaries

A DBMS-dependent data dictionary (also called merged or
integrated) is a component of a specific database management
system; it uses the software facilities available within the
DBMS to manage the data in the database. This type of data
dictionary minimizes redundancy and limits the number of
possible errors because data descriptions exist in only one
place, in the data dictionary. It also benefits from the
sophisticated backup and recovery facilities of the DBMS.

A data dictionary is also described as having active or
passive interfaces or a combination of the two. An interface
is a series of commands which connect the data
dictionary with other software such as compilers, operating
systems, report generators, and other programs. The data
dictionary supports these applications by providing the
metadata that is required for their execution. An active
data dictionary is one in which information is created,
accessed, or modified through the data dictionary interfaces.
New or changed metadata is automatically updated and
stored in the data dictionary. This is not true of a
passive data dictionary: when new metadata is generated,
the data dictionary may or may not be automatically updated,
and when data is retrieved, it may be accessed through the
data dictionary or directly from the database.
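The practical difference between active and passive dictionaries is whether a change to the database can go unrecorded. The Python sketch below is a simplified illustration under our own assumptions: the `DataDictionary` and `Database` classes and the PERSONNEL file are hypothetical, and a real active dictionary would mediate far more than file creation.

```python
class DataDictionary:
    """Holds metadata only; `active` decides whether updates are automatic."""

    def __init__(self, active: bool) -> None:
        self.active = active
        self.metadata: dict[str, list[str]] = {}

    def register(self, file_name: str, fields: list[str]) -> None:
        self.metadata[file_name] = list(fields)


class Database:
    def __init__(self, dictionary: DataDictionary) -> None:
        self.dictionary = dictionary
        self.files: dict[str, list[str]] = {}

    def create_file(self, file_name: str, fields: list[str]) -> None:
        self.files[file_name] = list(fields)
        # Active: the new metadata reaches the dictionary automatically.
        # Passive: someone must remember to call register() separately.
        if self.dictionary.active:
            self.dictionary.register(file_name, fields)


def undocumented_files(db: Database) -> list[str]:
    """Files that exist in the database but have no dictionary entry."""
    return sorted(set(db.files) - set(db.dictionary.metadata))


active_db = Database(DataDictionary(active=True))
passive_db = Database(DataDictionary(active=False))
for db in (active_db, passive_db):
    db.create_file("PERSONNEL", ["NAME", "SSN", "RANK"])
```

With the passive dictionary, `undocumented_files(passive_db)` reports the PERSONNEL file until its description is entered by hand; with the active dictionary the list stays empty.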

There are many perspectives from which to look at the
data that resides in a database. There is the physical (or
internal) view that consists of the actual physical
representation, format, and location of the data as "seen" by the
computer. There is a logical (or conceptual or global
enterprise) view called a schema which describes all of the
data in the database in its logical format, i.e., what types
of records are to be maintained, the contents of those
records, and the relationships among those records. This is
the data as it would be presented to a human, not its actual
computer format. In most cases, only the database
administrator has access to the schema. Another view is the
external view, also called a subschema, which is a subset
of the logical view tailored to a particular user or
application. This is analogous to a "window" through which only
a portion of the total data is seen. Subschemas can be
utilized to implement security by restricting a user's
access to data.

Figure 2.3 shows the three different perspectives of
data in a sample database of students at the Naval
Postgraduate School. (A) is the computer's physical view
and thus is not visible to the human user. (B) shows the
overall logical view of this small database. (C) is a
subset of (B) as it would be seen by a user who is
interested in only a portion of the database--in this case,
the senior Army officer who wants information only on Army
students.


Physical View as "seen" Within the Computer

(A)


Logical View of Stored Data

NAME                 SSN          SERVICE  RANK
MARKEY, Ronald P.    452-43-6029  USA      O-5
JOHNSON, Bruce K.    348-57-8826  USN      O-4
BROWN, Jennifer C.   512-47-2228  USNR     O-2
DAVIS, Thomas E.     662-76-8239  USAF     O-3
MASON, Robert J.     823-48-3991  USA      O-3
GEIE, Thomas W.      773-34-8725  USN      O-4
LANE, Donna F.       371-67-7476  USNR     O-3
WILLIAMS, Guy T.     547-23-3419  USA      O-3

(B)


One External View of the Data (subset of the logical view)

NAME                 SSN          RANK
MARKEY, Ronald P.    452-43-6029  O-5
MASON, Robert J.     823-48-3991  O-3
WILLIAMS, Guy T.     547-23-3419  O-3

(C)


Figure 2.3 Views Within a DBMS
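A subschema of the kind shown in view (C) amounts to a choice of fields plus a condition on the rows. The Python sketch below illustrates this with a few rows adapted from Figure 2.3; the `make_subschema` helper and its parameters are hypothetical names introduced here, not part of any DBMS described in this thesis.

```python
# A few rows of the logical view (B), adapted from Figure 2.3.
STUDENTS = [
    {"name": "JOHNSON, Bruce K.", "ssn": "348-57-8826", "service": "USN", "rank": "O-4"},
    {"name": "DAVIS, Thomas E.", "ssn": "662-76-8239", "service": "USAF", "rank": "O-3"},
    {"name": "MASON, Robert J.", "ssn": "823-48-3991", "service": "USA", "rank": "O-3"},
    {"name": "LANE, Donna F.", "ssn": "371-67-7476", "service": "USNR", "rank": "O-3"},
    {"name": "WILLIAMS, Guy T.", "ssn": "547-23-3419", "service": "USA", "rank": "O-3"},
]


def make_subschema(fields, row_filter):
    """Build an external view: only the chosen fields of the permitted rows."""
    def view(rows):
        return [{f: row[f] for f in fields} for row in rows if row_filter(row)]
    return view


# The senior Army officer's "window": Army students only, service column hidden.
army_view = make_subschema(
    fields=("name", "ssn", "rank"),
    row_filter=lambda row: row["service"] == "USA",
)

army_students = army_view(STUDENTS)
```

Because the user is handed `army_view` rather than `STUDENTS`, rows and fields outside the subschema never reach that user, which is how a subschema doubles as a security mechanism.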


B. THE STRUCTURE OF A DATA DICTIONARY

There are three kinds of elements upon which the structure,
or schema, of a data dictionary is built: entities,
attributes, and relationships. The basic element of the
dictionary is the entity. Each entity has a unique name and
represents an object in the real world, such as a person,
thing, or idea about which information is recorded. For
example, in our Naval Postgraduate School database we
collected information about students. We also described the
students by name, Social Security number, service, and rank.
These characteristics of an entity are called attributes,
and can be either quantitative or qualitative.

A relationship is a logical link between two entities
that can also be described by attributes. A relationship
will fall into one of three categories of mappings:
one-to-one, one-to-many/many-to-one, or many-to-many. A
one-to-one relationship exists when each entity or attribute is
logically linked to one and only one other entity or
attribute. For instance, we say that there is a one-to-one
relationship between an individual's Social Security number and
his name. In a one-to-many/many-to-one relationship, each
entity or attribute is logically linked to one or more other
entities or attributes. An example of this is the
relationship between the instructor of a class and the students in
that class. A many-to-many relationship occurs when one or
more entities or attributes are related to one or more other
entities or attributes. For example, there is a
many-to-many relationship between the attributes "color" and "model"
of a type of car--each color may be available on many
different car models and each car model may be available in
many different colors.
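The three mapping categories can be told apart mechanically by examining the observed pairs of linked items. The Python sketch below is a minimal illustration; the function name `classify_mapping` and the sample pairs are our own assumptions rather than anything taken from a data dictionary product.

```python
from collections import defaultdict


def classify_mapping(pairs):
    """Classify a set of (left, right) links by its mapping category."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)

    # Does any left item link to several right items, and vice versa?
    left_fans_out = any(len(v) > 1 for v in left_to_right.values())
    right_fans_out = any(len(v) > 1 for v in right_to_left.values())

    if left_fans_out and right_fans_out:
        return "many-to-many"
    if left_fans_out:
        return "one-to-many"
    if right_fans_out:
        return "many-to-one"
    return "one-to-one"


# Hypothetical sample links for the three examples in the text.
ssn_to_name = [("348-57-8826", "JOHNSON"), ("662-76-8239", "DAVIS")]
instructor_to_student = [("Instructor A", "JOHNSON"), ("Instructor A", "DAVIS")]
color_to_model = [("red", "sedan"), ("red", "coupe"), ("blue", "sedan")]
```

The classification reflects only the pairs supplied: a relationship that is many-to-many in principle will look one-to-one if the sample happens to contain no repeated items.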

In order to understand the generic terms we have
presented in their proper context, it is important to
differentiate between the dictionary schema itself, the
metadatabase that it governs, and the "real" data in the
organization's database. These concepts are made even more
confusing because the terminology used to refer to these
three levels of data differs from vendor to vendor and from
author to author. We will look at these levels using the
Applied Data Research, Inc. DATADICTIONARY terminology
[Ref. 4] because it provides the clearest distinction
between the three. (DATADICTIONARY will be discussed in
depth in Chapter V.)

At the highest level of abstraction, entities, attributes,
and relationships are grouped by type:


the dictionary schema can then be thought of as
containing all existing entity-types, relationship-types,
and attribute-types, any one of which will also
be referred to as a schema descriptor [Ref. 5].


The schema descriptors are the general categories of data
that is stored in the metadatabase. Figure 2.4 shows
examples of some standard schema descriptors.


Entity-types    Attribute-types     Relationship-types
File            Author              Contains
Record          Description         Owns
Field           Password            Processes
Module          Status              Derived From
Program         Version             Resides
Report          Frequency           Uses
Job             Security Class      Includes
Dataview        Alias               Authority
User            Comment             Accesses
System          Effective Date
Process         Usage Statistics

Figure 2.4 Sample Schema Descriptors


At the metadatabase level, we look at specific instances
of schema descriptors. Thus, we define an entity-occurrence
as a specific instance of the general category entity-type.
If PROGRAM is the entity-type, ACCOUNTS RECEIVABLE could be
an entity-occurrence. Similarly, a relationship-occurrence
is a specific instance of the general category
relationship-type. The relationship-type ACCESSES may have as a
relationship-occurrence PROGRAM-ACCESSES-FILE. At this
level, we also talk about the specific characteristics of an
attribute-type. An attribute-type is the name of a
characteristic of an entity-occurrence, as Social Security Number
characterizes a student. An attribute-characteristic is not
the value of the attribute-type, but the parameters of an
attribute-type, such as its length and format. For example,
the attribute-type Social Security Number will be
characterized as eleven characters long, of the form 999-99-9999.
Entity-occurrences, relationship-occurrences, and
attribute-characteristics will be referred to as the descriptors of
the metadatabase.

At the "real" data level of the organization's database,
we think in terms of actual values of data, such as
"Jennifer C. Brown", "547-23-3410", "left-handed monkey
wrench", "IBM 3033", or "93943". These are all values of
the attributes of an entity, and are called
attribute-values. An example of each of the levels of data is given in
Figure 2.5. We will use the generic terms entity,
attribute, and relationship in this thesis where it is not
necessary to distinguish between the three levels.

When a data dictionary is received from the vendor, it
contains a system standard schema which includes certain
basic entity-types, attribute-types, and relationship-types
chosen by the vendor. A data dictionary is extensible if an
organization is able to customize the schema by defining its
own entity-types, attribute-types, and relationship-types in
addition to those included in the system standard schema.


Schema Descriptors          Example
Entity-type                 Record
Attribute-type              Name
Relationship-type           Contains

Metadatabase Descriptors    Example
Entity-occurrence           Student
Attribute-characteristic    25 characters, alphanumeric
Relationship-occurrence     Student-Contains-Name

Database Data               Example
Attribute-value             Ronald P. Markey

Figure 2.5 Comparison of Data Levels
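One way to see how the three levels fit together is to check a database value against the attribute-characteristic recorded for its attribute-type. The Python sketch below is an illustration under our own assumptions: the `AttributeCharacteristic` class, the regular expressions, and the decision to allow spaces and punctuation in a name are not taken from ADR's DATADICTIONARY.

```python
import re
from dataclasses import dataclass

# Schema level: the general categories (attribute-types).
ATTRIBUTE_TYPES = {"Name", "Social Security Number"}


@dataclass(frozen=True)
class AttributeCharacteristic:
    """Metadatabase level: parameters of an attribute-type, not its value."""
    attribute_type: str
    max_length: int
    pattern: str  # regular expression describing the permitted format


METADATABASE = {
    "Name": AttributeCharacteristic("Name", 25, r"[A-Za-z0-9 .,'-]+"),
    "Social Security Number": AttributeCharacteristic(
        "Social Security Number", 11, r"\d{3}-\d{2}-\d{4}"
    ),
}


def is_valid(attribute_type: str, value: str) -> bool:
    """Database level: test an attribute-value against its characteristic."""
    c = METADATABASE[attribute_type]
    return len(value) <= c.max_length and re.fullmatch(c.pattern, value) is not None


ok_ssn = is_valid("Social Security Number", "348-57-8826")
bad_ssn = is_valid("Social Security Number", "348578826")
ok_name = is_valid("Name", "Jennifer C. Brown")
```

The value "348-57-8826" never appears in `METADATABASE`; only its length and format do, which is the distinction between an attribute-characteristic and an attribute-value.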


C. THE FUNCTIONS OF A DATA DICTIONARY

The functions performed by a typical data dictionary
fall into four categories: definition, update, retrieval,
and software interface. A data dictionary should be
evaluated in each category according to the ease and success with
which the functions are performed.

1.  Definition 

The first step in the implementation of a data
dictionary is to collect information about some portion of
an organization's data, such as the U.S.S. Constellation's
supply department. This is done by interviewing supply
department personnel, identifying the data received and
produced by the department, and analyzing the software that
manipulates that data. Once entities, attributes, and
relationships have been defined, these data elements are entered
into the data dictionary using the dictionary's data
definition commands. The elements are classified according to the
entity-types, attribute-types, and relationship-types of the
system standard schema, or the dictionary administrator may
use customized data types as necessary, assuming the
dictionary is extensible.

2.  Update 

As an organization evolves, so does its data. One
of the functions of the data dictionary is to allow the
addition, modification, and deletion of elements. For
instance, a new Navy regulation might require the supply
department to keep track of certain data about a new
inventory item and to report this data quarterly. Or perhaps the
administrative department will have to change zip codes to
the new nine-digit format on all correspondence. Each of
these changes will be introduced via modifications to the
dictionary schema.
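The definition and update functions described above can be sketched as a small class that checks each new element against the schema's entity-types and permits customization only when the dictionary is extensible. This Python sketch is illustrative: the class `ExtensibleDictionary`, its method names, the "Regulation" entity-type, and the `digits` attribute are all assumptions introduced here.

```python
class ExtensibleDictionary:
    """A toy dictionary with a vendor-supplied schema and optional extensions."""

    # A few entity-types standing in for a system standard schema.
    STANDARD_ENTITY_TYPES = frozenset({"File", "Record", "Field", "Program", "Report"})

    def __init__(self, extensible: bool = True) -> None:
        self.extensible = extensible
        self.entity_types = set(self.STANDARD_ENTITY_TYPES)
        self.entities: dict[str, dict] = {}

    def add_entity_type(self, entity_type: str) -> None:
        """Customize the schema; refused if the dictionary is not extensible."""
        if not self.extensible:
            raise PermissionError("this dictionary's schema is not extensible")
        self.entity_types.add(entity_type)

    def define(self, name: str, entity_type: str, **attributes) -> None:
        """Definition: enter a new element, classified by entity-type."""
        if entity_type not in self.entity_types:
            raise ValueError(f"unknown entity-type: {entity_type}")
        if name in self.entities:
            raise KeyError(f"already defined: {name}")
        self.entities[name] = {"type": entity_type, "attributes": dict(attributes)}

    def modify(self, name: str, **attributes) -> None:
        """Update: change the recorded attributes of an existing element."""
        self.entities[name]["attributes"].update(attributes)

    def delete(self, name: str) -> None:
        """Update: remove an element that is no longer needed."""
        del self.entities[name]


dd = ExtensibleDictionary()
dd.define("Zip Code", "Field", digits=5)
dd.modify("Zip Code", digits=9)            # the move to nine-digit zip codes
dd.add_entity_type("Regulation")           # a customized entity-type
dd.define("Quarterly Inventory Report Rule", "Regulation")
```

The zip code change is made once, in the dictionary entry, rather than separately in every program that handles addresses.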

3. Retrieval

Information  can  be  retrieved  from  a  data  dictionary 
by  using  query  language  commands  or  the  report-generating 
capability of the dictionary. A dictionary will provide
structured commands or an English-like query language that
will help the supply department to find out the Navy part
number for a monkey wrench. It will also allow the
dictionary administrator to find out which users have access
to a particular subschema. Reports are produced by a data
dictionary according to a vendor-defined format or to user
specifications. Reports generally produce a larger volume of
output than queries and are often printed out in hard
copy.
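
The entry, update, and retrieval functions described above can be sketched in a few lines. This is a minimal illustration that assumes a toy in-memory dictionary; the names `DataDictionary`, `define`, `modify`, and `query`, and the sample attribute values, are hypothetical and do not correspond to any particular dictionary package.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One dictionary entry: metadata about a data element, not the data itself."""
    name: str
    entity_type: str                      # e.g. "ELEMENT", "RECORD", "FILE"
    attributes: dict = field(default_factory=dict)


class DataDictionary:
    """Toy in-memory dictionary supporting entry, update, and retrieval."""

    def __init__(self):
        self._elements = {}

    def define(self, name, entity_type, **attributes):
        # Entry: refuse duplicates so each element has exactly one definition.
        if name in self._elements:
            raise ValueError(f"{name} is already defined")
        self._elements[name] = Element(name, entity_type, dict(attributes))

    def modify(self, name, **attributes):
        # Update: change attributes of an element that already exists.
        self._elements[name].attributes.update(attributes)

    def delete(self, name):
        # Update: remove an element that is no longer needed.
        del self._elements[name]

    def query(self, **criteria):
        # Retrieval: names of elements whose attributes match every criterion.
        return sorted(
            e.name for e in self._elements.values()
            if all(e.attributes.get(k) == v for k, v in criteria.items())
        )


dd = DataDictionary()
# Attribute values below are illustrative only.
dd.define("ZIP-CODE", "ELEMENT", length=5, owner="ADMIN")
dd.define("NAVY-PART-NUMBER", "ELEMENT", length=13, owner="SUPPLY")
dd.modify("ZIP-CODE", length=9)           # the change to nine-digit zip codes
print(dd.query(owner="ADMIN"))
```

A real dictionary would persist these definitions and expose them through a command or query language; the sketch only shows that entry, update, and retrieval all operate on metadata rather than on the organization's data itself.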

4.  Software  Interface 

The  software  interface  function  provides  a  means  of 
access to the data dictionary for applications software,
including compilers, editors, and database management
systems. A COPY command is used to bring data descriptions
(e.g., of records or files) directly into the program being
developed from the data dictionary. Thus, the job of the
programmer is made easier and data use is standardized. It
is also possible for applications software to directly
retrieve and make changes to the elements in a data
dictionary.


III. FEDERAL INFORMATION PROCESSING STANDARD FOR DATA

DICTIONARY  SYSTEMS 


A.  INTRODUCTION 

The Institute for Computer Sciences and Technology of
the National Bureau of Standards is in the process of
developing a standard software specification for data
dictionaries. The Federal Information Processing Standard for Data
Dictionary Systems (FIPS DDS) is intended to serve as a
guideline for the evaluation and selection of data
dictionaries to be used by the federal government. The four
volumes¹ "specify and describe the functionality, database
structure, and user interfaces of the FIPS DDS" [Ref. 6].

We examined three volumes of the FIPS DDS: Command
Language Interface Specifications (volume 2), Interactive
Interface Descriptions (volume 3), and Dictionary
Administrator Support Specifications (volume 4). The
subject of each of the volumes corresponds to one of the
three categories of users who will interact with a data
dictionary--the experienced user, the relatively
inexperienced user, and the administrator of the data dictionary.

The FIPS DDS describes in detail a suggested system
standard schema for a data dictionary, including definitions
and use of the schema descriptors. Each of the volumes
presents the syntax for commands necessary for its target
users to manipulate the dictionary. In addition, the
results of each command are detailed, with error messages
and "successful completion" messages listed where
applicable.


¹Note: Volume 1 is not yet available for review. The
FIPS DDS is in draft form and has not been formally approved
by the National Bureau of Standards.




B.  SYSTEM  STANDARD  SCHEMA 

The system standard schema set forth in the FIPS DDS
provides basic entity-types, attribute-types, and
relationship-types as follows:

Entity-types

1. SYSTEM--a collection of processes and data

2. PROGRAM--an automated process

3. MODULE--an automated process which is a logical
subdivision of a PROGRAM or an independent process
called by a PROGRAM

4. FILE--an organization's data collection

5. RECORD--logically associated data which belongs to
the organization

6. DOCUMENT--human-readable data collections

7. ELEMENT--data belonging to the organization

8. USER--members or collections of members belonging to
the organization using the facilities available in
the data dictionary

9. DICTIONARY-USER--users of the dictionary system
itself

10. ACCESS-CONTROLLER--specifies access restrictions to
an entity or set of entities in the dictionary

SYSTEM, PROGRAM, and MODULE are of the class "Process";
FILE, RECORD, DOCUMENT, and ELEMENT are of the class "Data";
USER is classed as "External"; and DICTIONARY-USER and
ACCESS-CONTROLLER are of the class "Security".
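
The grouping of entity-types into classes can be written down as a simple lookup table. The sketch below only restates the classification given above; the Python names (`EntityClass`, `ENTITY_TYPE_CLASS`, `entity_types_in`) are hypothetical and are not part of the FIPS DDS.

```python
from enum import Enum


class EntityClass(Enum):
    PROCESS = "Process"
    DATA = "Data"
    EXTERNAL = "External"
    SECURITY = "Security"


# Entity-type to class mapping, as listed in the text above.
ENTITY_TYPE_CLASS = {
    "SYSTEM": EntityClass.PROCESS,
    "PROGRAM": EntityClass.PROCESS,
    "MODULE": EntityClass.PROCESS,
    "FILE": EntityClass.DATA,
    "RECORD": EntityClass.DATA,
    "DOCUMENT": EntityClass.DATA,
    "ELEMENT": EntityClass.DATA,
    "USER": EntityClass.EXTERNAL,
    "DICTIONARY-USER": EntityClass.SECURITY,
    "ACCESS-CONTROLLER": EntityClass.SECURITY,
}


def entity_types_in(entity_class):
    """Return the entity-types that belong to one class, in sorted order."""
    return sorted(
        name for name, cls in ENTITY_TYPE_CLASS.items() if cls is entity_class
    )


print(entity_types_in(EntityClass.DATA))
```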

Attribute-types

There are 55 attribute-types included in the system
standard schema, similar to the ones shown in Figure 2.4.

Relationship-types

The standard relationship-types provided by FIPS are as
follows:

1. CONTAINS--describes entities composed of other entities

2. PROCESSES--shows the relationship between a process
and data

3. RESPONSIBLE-FOR--shows the association between
entities representing organizational components and
entities denoting organizational responsibility

4. RUNS--shows the relationship between a user and a
process

6. TO--shows the flow between two processes

7. DERIVED-FROM--shows that an entity is the result of
some operation on another entity

The FIPS DDS includes an extensibility facility to
provide  for  the  customization  of  the  system  standard 
schema  to  match  the  organization's  needs. 


C. COMMAND LANGUAGE INTERFACE SPECIFICATIONS

The experienced user is one who is familiar with the
structure and commands of the data dictionary and who needs
access to the full functionality of the data dictionary.
Command language commands are used to facilitate this access
by allowing the user to:

--define data elements
--maintain the dictionary (add/modify/delete)
--report on dictionary elements
--query the dictionary about data elements
--build entity lists and perform operations on groupings
of entities that meet certain criteria (useful for global,
vice individual, operations)
--support applications programs that interact with the
data dictionary
--perform general utilities, such as changing the mode
of operation and obtaining help information.



The syntax of each of the command language commands is
presented in the FIPS DDS using Backus-Naur form.² For
example, the following command would be used to modify an
entity that already exists in the dictionary:

MODIFY-ENTITY

    {[WHERE] NAME [IS] <entity-name>

    [ADD NEW-VERSION [<version-number>]]

    WHERE ATTRIBUTES [ARE] <attribute-clause-1>

    [,..., [<attribute-clause-n>]]}

where:

--entity-name refers to a single entity in the
dictionary

--NEW-VERSION is an optional clause which results in the
creation of a new entity which has a primary-name consisting
of the assigned-name of the entity-name specified and the
next-highest version-number

--attribute-clause-n refers to a clause used to designate
the attributes of the specified entity which are to be
modified
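
To make the bracket notation concrete, the following sketch assembles one valid instance of the MODIFY-ENTITY command from its parts. It is an illustration only: the function name `modify_entity_command` is hypothetical, the optional keywords WHERE, IS, and ARE are always written out, and attribute clauses are treated as opaque strings because their internal syntax is not reproduced in this chapter.

```python
def modify_entity_command(entity_name, attribute_clauses,
                          add_new_version=False, version_number=None):
    """Build one instance of the MODIFY-ENTITY syntax shown above.

    attribute_clauses is a list of already-formed clause strings; their
    internal syntax is not defined here and is passed through unchanged.
    """
    if not attribute_clauses:
        raise ValueError("at least one attribute clause is required")

    lines = ["MODIFY-ENTITY", f"  WHERE NAME IS {entity_name}"]

    if add_new_version:
        clause = "  ADD NEW-VERSION"
        if version_number is not None:    # the version-number is optional
            clause += f" {version_number}"
        lines.append(clause)

    lines.append("  WHERE ATTRIBUTES ARE " + ", ".join(attribute_clauses))
    return "\n".join(lines)


# The attribute clause text below is a made-up placeholder.
print(modify_entity_command("PART", ["DESCRIPTION = 'monkey wrench'"],
                            add_new_version=True))
```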


D. INTERACTIVE INTERFACE SPECIFICATIONS

The  interactive  interface  for  the  relatively  inexperi¬ 
enced  user  is  designed  to  lead  the  user  step-by-step  through 
the  desired  operations.  Without  having  to  master  the 
command  language  commands,  the  interactive  interface  user 
has  a  large  subset  of  the  total  functionality  available 
within  the  data  dictionary,  including  manipulation, 
reporting, querying, and entity list operations. The FIPS
DDS recommends that this interface be implemented by means
of  "panels"  (screens)  that  are  presented  to  the  user  in 
sequence  and  which  contain  the  following  information  areas: 


1. state area--tells the user where (in which
dictionary) he is and what he is doing

2. data area--for entering and displaying data

3. schema area--used mostly for dictionary updates to
show available options and limitations on actions

4. message area--for error messages and warnings

5. action area--tells the user how to proceed from the
current panel

6. help area--for the display of help information
requested by the user


²Backus-Naur form is explained in Appendix A.

The user begins his session with the data dictionary at
a "home panel" which provides entry into the system. At any
point along the way he has the option of saving or undoing
any panel with which he has been working. This panel-driven
interface ensures that the user always knows where he is in
the dictionary, what mistakes he has made, what choices he
has to continue, and what help is available to him.




E. DICTIONARY ADMINISTRATOR SUPPORT SPECIFICATIONS

The administrator of the data dictionary, of course, has
access to both the standard command language and the
interactive interface. His or her main concern, however, is the
management of the schema. This is accomplished by means of
a specialized set of commands for

--extending the system standard schema
--reporting on the schema
--implementing access control measures
--controlling export from and import to the dictionary.

We have already defined the extensibility facility as
the ability to add schema descriptors to the system standard
schema. The report facility allows the administrator to
generate a listing of the entire schema or any subset
thereof. The security facility provides commands for
restricting the access of users to the dictionary by
specifying which commands the user is allowed to execute. The
export/import facility allows transfer of parts of one
dictionary to another, but only between dictionaries whose
schemas are identical, in order to preserve the integrity of
the "target" dictionary.
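
Two of these administrator facilities lend themselves to a short sketch: command-level access control and the rule that export/import is allowed only between dictionaries with identical schemas. The code below is a simplified illustration under assumed names (`Dictionary`, `transfer`, `PERMITTED`, `may_execute`, and the user and command names); it is not the FIPS DDS command set.

```python
class Dictionary:
    """Hypothetical dictionary: a schema (set of descriptor names) plus entries."""

    def __init__(self, schema, entries=None):
        self.schema = frozenset(schema)
        self.entries = dict(entries or {})


def transfer(source, target, names):
    """Copy the named entries, but only between identical schemas."""
    if source.schema != target.schema:
        raise ValueError("schemas differ; import refused to protect the target")
    for name in names:
        target.entries[name] = source.entries[name]


# Command-level access control: the commands each user may execute.
# User and command names here are made up for illustration.
PERMITTED = {"analyst": {"QUERY", "REPORT"}}


def may_execute(user, command):
    """True if the administrator has granted this command to this user."""
    return command in PERMITTED.get(user, set())


STANDARD = {"SYSTEM", "PROGRAM", "FILE", "RECORD", "ELEMENT"}
ship = Dictionary(STANDARD, {"PART": {"type": "RECORD"}})
shore = Dictionary(STANDARD)
transfer(ship, shore, ["PART"])           # allowed: the schemas are identical
```

A dictionary whose schema has been extended with an additional entity-type would fail the equality check in `transfer`, which is the integrity safeguard the text describes.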

F. EVALUATION

It is certainly true that the FIPS DDS presents the
reader with very detailed specifications of the commands and
facilities for a standardized data dictionary; the volumes
we reviewed could serve as the basis for an initial design
specification for the development of data dictionary
software. A dictionary based on the FIPS specifications would
perform the required functions discussed in Chapter II and
would contribute to the organization's management of its
data. The military and the federal government would benefit
greatly from the availability of standard software to
achieve control over their data resources.

The major contribution of the FIPS DDS is its
orientation to the needs of the different kinds of users of a data
dictionary. This is particularly evident in the interface
that is suggested for use by inexperienced users of the data
dictionary. The panel-driven format with its six
information areas is far less intimidating than the syntax required
by the command language. Even so, the interactive interface
still requires a certain degree of sophistication on the
part of the "inexperienced" user if he is to be able to
manipulate the dictionary. Another strong point of the FIPS
DDS is its consistency of presentation and format. No
matter what the operation, the procedures needed to
manipulate the dictionary and the manner in which the dictionary
"responds" to the user are logical and predictable. The
commands, however, are complex and require knowledge of
Backus-Naur form.

Even though the FIPS DDS does indeed provide a
comprehensive software standard for the computer professional, we
do not believe that it achieves its goal of providing a
guide for the evaluation and selection of data dictionaries.
Although the addition of the introductory volume may help
remedy the problem, the three volumes of specifications
ignore the forest of reasons behind the implementation of a
data dictionary while concentrating solely on the patterns
of the leaves on each tree. The FIPS DDS will not be
extremely useful to the individual searching for basic
assistance in evaluating commercial data dictionary
packages. Many of the books and articles we have reviewed
provide better explanations of data dictionary features and
comprehensive evaluation criteria.

We found that the terminology that the FIPS DDS uses for
the dictionary schema and the metadatabase is not explained
clearly, nor is it any less confusing than that of any other
publication. In addition, no specific examples of how an
organization's data would be entered in the data dictionary
are given. We feel that it is more important for the
potential data dictionary user to understand how a data
dictionary will assist in the management of data than to see
samples of every conceivable type of error message that
could occur. A summary of recommended features such as the
one we have just presented and a list of criteria for
evaluation would be far easier for the reader to digest.

None of the data dictionary packages we have reviewed
does things totally the "FIPS way", and it is unlikely that
any commercial dictionary vendor will ever conform exactly
to FIPS DDS guidelines. However, it is likely that the
federal government will insist that FIPS standards be
incorporated into future dictionaries intended for
government use. In the next chapter we will develop a set of
criteria for an "ideal" data dictionary, taking FIPS DDS
recommendations into account. In Chapter V we will examine
four commercial data dictionary packages and evaluate their
success in meeting the ideal criteria.


IV. THE ROLE OF THE DATA DICTIONARY IN INFORMATION RESOURCE

MANAGEMENT 

In this chapter we will see how a data dictionary can
contribute to the goal of efficient management of an
organization's data. We will first discuss the process of
development of an information system in an organization and then
will discuss the three objectives of data dictionaries that
we have identified as contributing the most to the
accomplishment of this goal: data security, data integrity, and
documentation/maintenance. We will then develop a set of
criteria for the "ideal" data dictionary to be used in the
evaluation of data dictionary packages.

A.  INFORMATION  RESOURCE  MANAGEMENT 

Organizations today have become increasingly aware of
the need to manage data just as they manage other essential
resources. If properly managed, the necessary data will be
available, up-to-date, and retrievable when required to
provide information that is of value to the organization.
This concept is known as Information Resource Management, or
IRM, although it might also be referred to as Data Resource
Management.

IRM has been the focus of a great deal of interest in
recent  years.  In  October  of  1333,  the  Institute  for 
Computer Sciences and Technology of the National Bureau of
Standards (NBS) and the Association for Computing Machinery
(ACM) co-sponsored a workshop on IRM strategies and tools.
It was based on the premise that

IRM is currently one of the most significant topics
being discussed concerning information systems, and is
being discussed along a variety of lines of thought.
These include business systems planning; information
systems analysis, design, and development; database
design and implementation; the disciplines of office
management, paperwork management, and information
sciences management; and the various problems and costs
associated with implementing IRM to include each of
these areas. [Ref. 7]


The  Proceedings  of  the  workshop  defined  IRM  as 


whatever policy, action, or procedure concerning
information (both automated and non-automated supported)
which management establishes to serve the overall
current and future needs of the enterprise. Such
policies, etc., would include considerations of
availability, timeliness, accuracy, integrity, privacy,
security, auditability, ownership, use, and cost
effectiveness. [Ref. 8]


The recommendations of the NBS/ACM workshop on the role that
the data dictionary should play in IRM were incorporated
into the Federal Information Processing Standard for Data
Dictionary Systems that we discussed in Chapter III.

In  order  to  understand  how  the  data  dictionary  contrib¬ 
utes to the production of valuable information for an
organization, we will look more closely at the organization
itself  and  at  its  functions.  An  organization  is  made  up  of 
many  systems  that  convert  resources  into  usable  output.  An 
information  system,  then,  is  one  that  takes  raw  data  and 
transforms  it  into  information  that  can  be  used  by  the  orga¬ 
nization.  If  the  process  by  which  the  organization  develops 
its  information  systems  is  the  heart  of  information  resource 
management, then it is the data dictionary that keeps it
ticking. 

Assume that the U.S.S. Constellation has identified a
problem with the way a particular information system is
currently operating--it could be preventive maintenance
record-keeping, the supply department inventory, the
personnel administration system, or a system that affects
the entire organization. The process of analyzing the
system and developing a system to solve this problem evolves
through four distinct phases, called the System Development
Life Cycle (SDLC). We will show how the data dictionary
supports the SDLC, and thus IRM, through planning, study,
design/coding, and operation and maintenance. We have based
our analysis of the SDLC on that of Leong-Hong and Plagman
[Ref. 9].

1. Planning Phase

The Proceedings of the NBS/ACM workshop emphasized
the need for a "top-down" approach to IRM in an
organization. During the planning phase, the organization's
long-range plans, its functions, and structure are analyzed to
ensure  that  any  information  system  that  is  developed  will 
complement  those  needs. 

If a data dictionary is already in existence, it can
provide information about the functions of the organization
that have been defined, or it can document the initial
definition of those functions. For each function, it must be
determined who does it, what is produced, what other
functions it interacts with, and what inputs are needed to
accomplish the function. As an example, we can say of the
Payroll function that it is performed by the disbursing
office, paychecks and leave and earnings statements are
produced, it interacts with the personnel administration
system, and it requires data about all members of the crew,
including rank/rate, time in service, and so on.

At  this  stage  of  the  development  process,  the  "big 
picture"  is  drawn  while  the  details  are  left  until  later. 
Thus, general categories of data such as "accounting data"
and "personnel data" and the transactions that affect them
are defined and entered in the dictionary.

In the aggregate, this planning information
constitutes a conceptual data model. "Definition and analysis of
subsequent information requirements (and eventually,
database design) will be dependent upon this data model"
[Ref. 10]. The fact that the development of this model has
been automated, rather than manual, ensures a quicker,
standardized process.

2. Study Phase

At  this  point  in  the  SDLC,  a  greater  level  of  detail 
is  introduced.  The  data  dictionary  provides  a  common,  stan¬ 
dardized  source  of  information  about  the  inputs  and  outputs 
of the organization's functions. Specific entities,
attributes, and relationships are chosen from the general
categories of data identified in the planning phase. The entity
PART in the Constellation's inventory system may be
described by the attributes Navy Part Number, Description,
Storage Location, and Quantity. There may also be a
many-to-many relationship assigned between PART and DEPARTMENT.
Reports required to be produced are also defined and the
necessary input data is identified.
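
The PART example can be sketched as a pair of record definitions plus an explicit many-to-many link. This is an illustration under assumed names: the Python field names mirror the four attributes in the text, while `Department`, `link`, `departments_for`, `parts_for`, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    navy_part_number: str
    description: str
    storage_location: str
    quantity: int


@dataclass(frozen=True)
class Department:
    name: str


# Many-to-many relationship: one (part number, department name) pair per link.
part_department = set()


def link(part, department):
    part_department.add((part.navy_part_number, department.name))


def departments_for(part):
    """Departments related to a part; a part may be related to several."""
    return sorted(d for p, d in part_department if p == part.navy_part_number)


def parts_for(department):
    """Part numbers related to a department; a department may have many."""
    return sorted(p for p, d in part_department if d == department.name)


# Sample values are invented for illustration.
wrench = Part("HYPOTHETICAL-0001", "Monkey wrench", "Bin 12", 40)
gasket = Part("HYPOTHETICAL-0002", "Gasket", "Bin 7", 200)
supply, engineering = Department("SUPPLY"), Department("ENGINEERING")

link(wrench, supply)
link(wrench, engineering)
link(gasket, engineering)
print(departments_for(wrench), parts_for(engineering))
```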

This information provides what is called a detailed
conceptual model, an expansion of the conceptual model of
the planning phase. The data dictionary can be used to
identify redundancy within the data model by determining
whether the data entered already exists. In addition, with
the aid of the dictionary, the systems analyst will be

able to determine what data is available, how it is
being used, how it can be accessed, who has primary
responsibility for its definition and upkeep, and most
important, whether there is conflict in using this data,
that is, what impact it will have on other application
systems [Ref. 11].


3. Design/Coding Phase

The purpose of the design phase is to provide
specifications for programming and implementing the system. It
is here that the data dictionary's schema descriptors will
be used or expanded to meet the needs of the system. If a
database does not already exist, and it is determined that
one is required, the data dictionary schema will provide a
basis from which to implement one. Data integrity is
enforced because the dictionary serves as the sole source of
data definition and structure.

When software is being coded, the data dictionary
provides documentation for the programmer and a COPY
facility for transporting record definitions, for example,
into the program being developed. An important element of
the dictionary is the constraints that are defined for data
values. In this way, data that is input to a program can be
checked against the constraints that have been established.
Documentation  of  the  program  includes  the  author,  a  descrip¬ 
tion,  input  requirements,  output  produced,  and  information 
on  what  other  programs  are  called  upon,  all  of  which  are 
incorporated  into  the  data  dictionary. 
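
Checking input against dictionary-held constraints can be sketched as follows. The example assumes a simple constraint record covering whether a field is mandatory, whether it is numeric, and its length limits; the names `Constraint` and `violations` and the sample values are hypothetical, not drawn from any dictionary product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Constraints the dictionary records for one data element."""
    mandatory: bool = True
    numeric: bool = False
    min_length: int = 0
    max_length: int = 255


def violations(value, constraint):
    """Return a list of the constraints that a string input value breaks."""
    problems = []
    if value is None or value == "":
        if constraint.mandatory:
            problems.append("missing mandatory value")
        return problems                   # nothing further to check
    if constraint.numeric and not value.isdigit():
        problems.append("not numeric")
    if not constraint.min_length <= len(value) <= constraint.max_length:
        problems.append("length out of range")
    return problems


# A nine-digit zip code definition, stored once and shared by every program.
ZIP_CODE = Constraint(mandatory=True, numeric=True, min_length=9, max_length=9)

print(violations("939435000", ZIP_CODE))
print(violations("9394A", ZIP_CODE))
```

Because every program takes the constraint from the same dictionary entry, a change such as a new field length is made in one place rather than in each program.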

4. Operation and Maintenance

After  a  new  system  has  been  implemented,  the  work  of 
the  data  dictionary  does  not  end.  All  of  the  documentation 
that  has  been  recorded  during  the  development  of  the  system 
serves  as  a  base  of  reference  for  the  users  of  the  system. 
In addition to the database administrator and the
administrator of the dictionary, the key players in information
resource management who benefit from the use of a data
dictionary fall into six groups, according to Allen, Loomis,
and Mannino [Ref. 12]:


1. The data administrator, who is responsible for the
overall administration of the data resource, uses
the dictionary as a tool to enforce the way data is
stored, maintained, and monitored.

2. Data processing managers benefit from the
dictionary's reports on data usage.

3. Operations personnel retrieve information from the
dictionary about jobs that are being run.

4. Programmers and analysts use the dictionary to
retrieve data definitions and to document a system
being developed.

5. End users access the data dictionary for
descriptions of their dataviews.

6. Finally, auditors will use the documentation
provided by the data dictionary to trace data and
programs as they are used in the computer system.

It is the process of implementing a data dictionary that
we have just described--the analysis of the organization,
the definition of its functions, and the documentation of
its information systems--that makes the dictionary so
important in information resource management. We have seen that
during the development of an information system, the data
dictionary is involved from the initial planning stage,
through the programming process, through the operation, and
into the maintenance of the system. The dictionary provides
the standards for data which will be used throughout the
life of the system and referenced when developing other
systems. Key contributions include decreasing the amount of
redundancy of data required to be stored, enforcing security
of the valuable data resource through access controls and
implementation of user views, and providing documentation
which serves as a "corporate history" and as a reference
upon which maintenance and auditing are based. These
objectives of data dictionary usage are discussed in detail in
the next section.


B. OBJECTIVES OF A DATA DICTIONARY

In this section we will focus on the three major
contributions of the data dictionary to the management of an
organization's data. These are data security, data integrity,
and documentation/accountability. Although we recognize
that other objectives of data dictionary usage might be
identified, we believe that each will fall into one of these
three major groupings.

1. Data Security

There are two distinct levels of security of the
data in an organization's database which will be provided
either by the data dictionary or by the database management
system itself. First, procedures should exist to ensure
that only authorized personnel are allowed to access the
information contained within the database. The widespread
use of computers and the increasing sophistication of users
have made an organization's data vulnerable to embezzlers,
amateur "hackers", corporate spies, and careless employees.
Second, the system should contain provisions for controlling
the amount and types of data that each authorized user is
allowed to access within the system. Some of the
sophisticated data dictionaries, for example, include a trace
mechanism which increases security by recording every inquiry
that is made into system files and data. If an intrusion is
made into the system by unauthorized personnel, the
specifics of that inquiry, including the data which was
accessed, will be recorded.

Metadata should be afforded at least as much
protection as, if not more than, the data in the database.
Leong-Hong and Plagman [Ref. 13] present an example of the
importance of the security of metadata as it concerns

the data resources in intelligence/military applications
such as the classification code of intelligence
documents. When security profiles for the metadata entities
are stored in the metadatabase, unauthorized access to
the metadata could be most damaging. This is because
presumably one would be able to 'crash into' the system
using that information.

It is the task of the dictionary administrator to analyze
the metadata to determine the levels of security required
and to grant access privileges (read and write, read only,
update) to users for certain portions of the metadata.
Information about users, their passwords, and privileges is
stored in the data dictionary and is accessible only to
personnel authorized by the administrator.

We have already shown in Figure 2.3 that subschemas
contribute to security by limiting the size of the "window"
through which a database user looks at data. When a user
attempts to access a particular subschema, the request is
routed through the data dictionary to determine whether
access is authorized and, if so, the structure of the
subschema. Only at this point is the "real" data in the
database accessed.
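
The routing of a subschema request through the dictionary, together with a trace of every inquiry, can be sketched as below. This is an illustration only: the tables `SUBSCHEMAS` and `PRIVILEGES`, the list `AUDIT_LOG`, the function `open_subschema`, and the user and subschema names are all hypothetical.

```python
# Metadata held in the dictionary (all names are invented for illustration).
SUBSCHEMAS = {
    "SUPPLY-VIEW": ["NAVY-PART-NUMBER", "DESCRIPTION", "QUANTITY"],
}
PRIVILEGES = {
    ("smith", "SUPPLY-VIEW"): "read only",
}
AUDIT_LOG = []                            # trace of every inquiry, allowed or not


def open_subschema(user, subschema):
    """Check authorization in the dictionary before any real data is touched.

    Returns the user's privilege and the structure of the subschema.
    """
    privilege = PRIVILEGES.get((user, subschema))
    AUDIT_LOG.append((user, subschema, privilege is not None))
    if privilege is None:
        raise PermissionError(f"{user} may not access {subschema}")
    return privilege, SUBSCHEMAS[subschema]


privilege, fields = open_subschema("smith", "SUPPLY-VIEW")
print(privilege, fields)
```

Only after `open_subschema` returns would a database management system use the returned structure to fetch the data itself; an unauthorized request is logged and stops at the dictionary.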

2.  Data  Integrity 

The  keys  to  data  integrity  are  the  control  of  inputs 
to  the  database  and  the  minimization  of  data  duplication. 
Properly used, these keys will enhance communication between
users by ensuring that a single, correct source of data is
maintained.

Because the data in a database is shared among many
users, it is essential to have some means of enforcing
standards for entering data, updating it, and maintaining it.
For example, the data dictionary identifies constraints, or
limitations on the values data can have. Fields can be
defined as being mandatory or optional, alphanumeric or
numeric, and as having a minimum or maximum length. The data
dictionary contains comments on how data should be used in
order to assist those using the data dictionary. Another
important control feature of a data dictionary is how it
deals with synonyms--an entity or attribute with more than
one name. For instance, the entities EMPLOYEE,
REGIONAL_MANAGER, and EXECUTIVE may all be used by different
departments in the organization to refer to Linda Smith.
The administrator must standardize the terminology used in
the organization and eliminate as many synonyms as possible.
When this is not feasible, all of these synonyms, or
aliases, must be recorded in the data dictionary. Van Duyn
[Ref. 14] explains that


It is not unusual to have similar types of data elements
in the database and in various applications. In such
cases, and in cases where the same data type is known by
other names, the DDS [data dictionary] can be used to
inform the users of the relationships that exist among
these data and of the disposition of their usage. In
other words, the DDS provides information as to which
modules/programs and systems use the same data type and
how they relate.
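
Recording synonyms so that every alias resolves to one standard name can be sketched as follows. The sketch assumes EMPLOYEE is the name the administrator has chosen as the standard; the class `AliasRegistry` and its methods are hypothetical.

```python
class AliasRegistry:
    """Maps every recorded synonym to a single standard (canonical) name."""

    def __init__(self):
        self._canonical = {}

    def register(self, canonical, *aliases):
        for name in (canonical, *aliases):
            existing = self._canonical.get(name)
            # A name may not be a synonym for two different standard names.
            if existing is not None and existing != canonical:
                raise ValueError(f"{name} is already an alias of {existing}")
            self._canonical[name] = canonical

    def resolve(self, name):
        """Return the standard name for any recorded name or alias."""
        return self._canonical[name]

    def aliases_of(self, canonical):
        """Return the recorded aliases of a standard name, in sorted order."""
        return sorted(
            n for n, c in self._canonical.items()
            if c == canonical and n != canonical
        )


registry = AliasRegistry()
# Assumption: EMPLOYEE is treated as the standard name in this example.
registry.register("EMPLOYEE", "REGIONAL_MANAGER", "EXECUTIVE")
print(registry.resolve("EXECUTIVE"), registry.aliases_of("EMPLOYEE"))
```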


The data dictionary also contributes to data
integrity because it reduces the necessity for duplication of
data and therefore lessens the opportunities for error. The
information about the components of different subschemas of
the same logical view is stored in the data dictionary in
place of the data itself. A user, whether writing a program
or creating a new entity-type, should be able to query the
data dictionary to ensure that the necessary routines or
entities do not already exist within the system. Perhaps

one of the most important benefits of DDS [data
dictionaries] is that because it gives accurate and timely
information, management can control more efficiently not
only the automated and manual data of the enterprise but
all its resources and operations. Consequently,
management is provided with precise and accurate data for
quick, profitable decision-making [Ref. 15].

Thus, the possibility of two users querying the database and
receiving different answers to the same question at the same
time is decreased.

3. Documentation/Maintenance

Because maintenance is the most expensive and time-consuming
phase of software development, documentation and
maintenance of the organization's data is probably the most
significant objective of the data dictionary. It is a fact
of software life that documentation is often avoided during
system development and program design. To a large extent,
this is because documentation can be prepared as an
"afterthought"; it is not essential to the operation of the
system. But when a system is developed that includes a data
dictionary from the beginning, the data which is required by
the data dictionary forces documentation to become an
integral part of the design. "The use of a dictionary provides
documentation of a quality and form that is simply not
available through less formalized procedures in the data
processing environment" [Ref. 16].

The data dictionary can also reduce the amount of
effort required by maintenance personnel because it provides
"a 'roadmap' for the programmer doing maintenance. It
records the programs being maintained, their data structures
and their relationships" [Ref. 17]. We have defined an
active data dictionary as one in which information is
created, accessed, or modified through the data dictionary
interfaces, with new or changed metadata automatically stored
in the data dictionary. This "continuous maintenance" can
be used to allow the database administrator to monitor where
data is used, who uses it, how often it is used, and what
changes have been made to it. Because the data dictionary
provides a wealth of documentation, it is possible to trace
an "audit trail" through the organization's data, from user
names and department to the kind of data used in a program
to how many records a certain field appears in. Also,

The tracking of how programs/modules use particular data
as well as which files/segments contain certain data is
extremely important to the systems analyst in performing
system changes. Through the DDS [data dictionary], he
or she is able to ascertain what impact the proposed
changes will have on other components of the system and
upon functional areas within the enterprise. By having
an accurate, up-to-date assessment of the location and
usage of data that will be involved in the system
change, the analyst can accomplish the task more
efficiently [Ref. 18].
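
The "where-used" tracking described in this passage can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions: the class WhereUsedIndex, its method names, and the program and file names are all hypothetical and do not describe any particular product. It records which programs use a data item and which files contain it, and then reports everything a proposed change to that item would touch.

```python
from collections import defaultdict


class WhereUsedIndex:
    """Records which programs and files reference each data item."""

    def __init__(self):
        self._programs = defaultdict(set)   # item -> programs using it
        self._files = defaultdict(set)      # item -> files containing it

    def record_program_use(self, program, item):
        self._programs[item].add(program)

    def record_file_content(self, file_name, item):
        self._files[item].add(file_name)

    def impact_of_change(self, item):
        """List every program and file touched by a change to `item`."""
        return {
            "programs": sorted(self._programs.get(item, ())),
            "files": sorted(self._files.get(item, ())),
        }


# Hypothetical usage: two programs and one file reference SSN.
index = WhereUsedIndex()
index.record_program_use("PAYROLL", "SSN")
index.record_program_use("ROSTER", "SSN")
index.record_program_use("ROSTER", "RANK")
index.record_file_content("STUDENT-FILE", "SSN")
impact = index.impact_of_change("SSN")
```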

Once an organization has decided to make a commitment
to manage its data using a data dictionary, it must
decide what kind of data dictionary best suits its particular
needs. In the next section, we will look at the
features of what we have called the "ideal data dictionary"
as a basis for evaluating the many commercially available
data dictionary packages from which the organization must
choose.

C. THE IDEAL DATA DICTIONARY

Having identified the functions of a data dictionary in
Chapter II and how they support the accomplishment of the
objectives just discussed, it will be helpful to use these
concepts to evaluate data dictionaries. The "ideal" data
dictionary would be one that possesses all the capabilities
necessary to support all potential users in all possible
applications. However, this ideal dictionary would be
impossible to conceptualize, much less to create. The ideal
data dictionary for an organization will depend on the
organization's size, functions, and needs. The potential users
of a dictionary will have to develop a set of criteria upon
which a candidate will be judged.


Many references provide criteria for evaluating data
dictionaries and identify those characteristics which are
vital to the management of the data resource.
Unfortunately, it is difficult to find two references that
propose the same criteria. One excellent source, Leong-Hong
and Plagman [Ref. 19], lists nine categories for evaluation:

1.  data  description  facility 

2.  data  documentation  support 

3.  metadata  generation 

4.  security  support 

5. integrity support

6. user interface

7.  ease  of  use 

8.  resource  utilization 

9. vendor support

It is important to recognize a distinction between two
categories of criteria for the ideal data dictionary: those
that evaluate the vendor and operating environment, and
those that evaluate the data dictionary itself. In the
former category, items like vendor support and reliability,
the choice between free-standing or DBMS-dependent data
dictionaries, the degree of integration with other system
components, and the quality of system documentation are
important considerations that may drive the decision between
two comparable data dictionaries. It is, however, the
latter type of criterion that will be vital in identification
of the essential requirements of the ideal data
dictionary. We have grouped all such requirements into six
categories: system standard schema and extensibility,
command and query languages, ease of use (including menus),
security, documentation and reports, and application
interfaces. (We have assumed that the objective of data
integrity will be accomplished by the correct, and enforced, use
of any data dictionary.) If a particular dictionary fully
supports each of these six criteria, then it will most likely
meet all of the organization's data management needs.

1. System Standard Schema and Extensibility

The ideal data dictionary must provide a system
standard schema with all the descriptors necessary to
support the range of applications required by the organization
while still being simple enough to be competitively
priced. It must provide "enough" descriptors to be fully
capable without providing so many that the schema becomes
confusing. Additionally, the ideal dictionary must support
the user (or data dictionary administrator) in modifying
existing schema descriptors and creating new entities,
relationships, and attributes. This extensibility is vital in
supporting applications specific to the organization's
needs.
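
To make the idea of extensibility concrete, the following Python sketch models a schema that starts from a fixed set of standard entity-types and lets the administrator add new entity-types and attributes. It is a sketch only, under our own assumptions: the class Schema, its methods, and the example names are hypothetical and do not represent the schema of any product.

```python
class Schema:
    """A standard schema that can be extended with new descriptors."""

    def __init__(self, standard_entity_types):
        # entity-type name -> set of attribute names
        self._attributes = {name: set(attrs)
                            for name, attrs in standard_entity_types.items()}
        self._standard = frozenset(standard_entity_types)

    def add_entity_type(self, name, attributes=()):
        """Create a new, organization-specific entity-type."""
        if name in self._attributes:
            raise ValueError(f"entity-type {name} already exists")
        self._attributes[name] = set(attributes)

    def add_attribute(self, entity_type, attribute):
        """Extend an existing entity-type with another attribute."""
        self._attributes[entity_type].add(attribute)

    def is_standard(self, name):
        return name in self._standard

    def attributes(self, entity_type):
        return sorted(self._attributes[entity_type])


# Hypothetical usage: extend a two-entity standard schema.
schema = Schema({"ITEM": ["DESCRIPTION"], "FILE": ["DESCRIPTION"]})
schema.add_entity_type("REPORT-FORM", attributes=["DESCRIPTION", "OWNER"])
schema.add_attribute("ITEM", "UNIT-OF-MEASURE")
```

Keeping track of which entity-types are standard lets a dictionary treat vendor-supplied descriptors differently from local extensions, for example when a new release of the standard schema is installed.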

2.  Command  and  Query  Languages 

The ideal dictionary must provide both command and
query languages. The command language must support creation
and modification of data structures and subsequent entry of
data into those structures. The command language must
include edit commands to facilitate addition, modification,
and deletion of system data. It should include commands
restricted to use by the data dictionary administrator,
e.g., password assignment. The ideal system will include a
query language to support the analysis and production of
usable information from the organization's data. Perhaps
one of the most important features of a data dictionary (and
database), query languages allow data to be screened in
order to provide concise and specific information to support
timely management decisions.


3. Ease of Use

Ease of use is an important aspect of the ideal data
dictionary. It must be supportive of new users while still
providing full functional support of the system "experts".
Two primary ingredients of user-friendliness are the
availability of menus and carefully conceived examples in the
dictionary's reference manuals. A hierarchy of menus can
reduce complex operations to a series of smaller, friendlier
steps, while user documentation provides easy-to-understand
examples that guide the inexperienced user through each phase
of system operation. As microcomputers and the concept of the
automated office continue to spread, ease of use will become an
even more important consideration in deciding which software
products to utilize.

4.  Security 

Security will be a vital concern of the ideal data
dictionary. Protection and control of system information
must be provided. The data dictionary administrator must be
provided the capability to control personnel access to
system data. He or she must also be able to grant different
degrees of access to different users. Similarly, users
should have the capabilities to protect, and grant access
to, those structures and data which they control.
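
The access rules just described can be sketched as follows. This Python sketch rests on our own assumptions: the access levels, the class AccessControl, and the user and entry names are hypothetical, chosen only to show an administrator granting different degrees of access and an owner granting access to an entry he or she controls.

```python
from enum import IntEnum


class Access(IntEnum):
    """Degrees of access, ordered from least to most privileged."""
    NONE = 0
    READ = 1
    UPDATE = 2


class AccessControl:
    def __init__(self, administrator):
        self.administrator = administrator
        self._owner = {}    # entry -> user who controls it
        self._grants = {}   # (user, entry) -> Access level

    def create_entry(self, user, entry):
        self._owner[entry] = user

    def grant(self, grantor, user, entry, level):
        # Only the administrator or the entry's owner may grant access.
        if grantor != self.administrator and self._owner.get(entry) != grantor:
            raise PermissionError(f"{grantor} does not control {entry}")
        self._grants[(user, entry)] = level

    def allowed(self, user, entry, needed):
        if user == self.administrator or self._owner.get(entry) == user:
            return True
        return self._grants.get((user, entry), Access.NONE) >= needed


# Hypothetical usage: the owner grants read-only access to another user.
acl = AccessControl(administrator="DDA")
acl.create_entry("SMITH", "STUDENT-RECORD")
acl.grant("SMITH", "JONES", "STUDENT-RECORD", Access.READ)
```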

5. Documentation and Reports

The documentation and reports created by the ideal
data dictionary must also be clear and understandable.
Timely and accurate preparation of reports is a key
objective of any DBMS. The data dictionary is uniquely qualified
to assist with this function. By ensuring the integrity of
data accessed and supporting query commands, the ideal data
dictionary can provide reports and documentation to answer
specific questions as they arise.




6. Application Interfaces

The final important characteristic of the ideal data
dictionary is its ability to interface with the other
applications that may exist in the organization. If the data
dictionary is free-standing, it should interface with many
of the currently available database management systems. If
DBMS-dependent, the dictionary should interface with all
components of that system. Additionally, the ideal data
dictionary should interface with code generators,
communication systems, and other agents of the users' environment.

In the following chapter, we will study and evaluate
four of the popular data dictionaries that are currently
available. We will use the characteristics of the ideal
data dictionary that we have defined to compare and contrast
the features of the four dictionaries. In addition, each
will be compared to the "standard" dictionary presented in the
FIPS DDS.




V.  EVALUATION  OF  COMMERCIAL  DATA  DICTIONARIES 

The purpose of this chapter is to review and evaluate a
cross-section of commercial data dictionary packages. We
selected four dictionaries: DATA DESIGNER, DATAMANAGER,
ORACLE, and DATADICTIONARY. User documentation and library
sources were the primary sources of information for our
evaluation. Additionally, ORACLE was available on the Naval
Postgraduate School's VAX minicomputer, and we observed
demonstrations of DATA DESIGNER and DATADICTIONARY.

A.  DATA  DESIGNER 

DATA DESIGNER is a free-standing data dictionary developed
by Database Design, Inc. It was introduced in 1975
with the goal of supporting logical database design by
solving some of the traditional problems associated with
multiple-application database management systems, such as
duplication of data, excessive storage requirements, data
consistency, complexity, and modifiability. DATA DESIGNER
can be used in conjunction with a variety of database
management systems, including IMS, IDMS, ADABAS, NOMAD, and
others. Additionally, it can produce designs that will
interface with COBOL and other non-DBMS tools or systems.

DATA DESIGNER can be characterized as an

automated, easy-to-use tool that assists the database
designer in formulating normalized views of the data
requirements and synthesizes these views into a canonical
normalized form. . . . DATA DESIGNER maintains
information needed to physically structure the database
for efficient performance [Ref. 20].

In addition to providing the standard functions of a data
dictionary, DATA DESIGNER goes several steps beyond. It
provides an extensive set of commands categorized as user
commands, edit commands, and plotting commands, as shown in
Table 1. It also supports limited production of models and
graphics. Furthermore, DATA DESIGNER's capabilities include
powerful generation options and report features that will
support the design and maintenance of applications.


TABLE 1

Standard Commands of DATA DESIGNER

User Commands     Edit Commands     Plotting Commands

ADD               DELETE            DRAW
BATCH             EDIT              DONE
BUILD             INSERT            RETURN
COPY              LIST              SET ALT
CREATE            RENUMBER          SET DEVICE
EMPTY             REPLACE           SET RANGE
END                                 SET TITLE
FILES                               SET TYPE
GENERATE                            SHOW
HELP
HIERARCHY
PLOT
PRINT
RENAME
REPORT
SHOW OPTIONS
TRANSFER
VALIDATE

DATA  DESIGNER  supports  logical  database  design  through  a 
five-step  process: 

1.  A  data  dictionary  file  is  created  that  contains  a 
list  of  all  standard  data  item  names  to  be  used. 

2. Subschema files are created that describe all of the
views necessary to support user data requirements.

3. The encoded user views are validated. This step
verifies the syntax of each view and ensures that
each data item name listed in a view is in the data
dictionary.

4. All of the verified user views are synthesized into
a logical data model. Reports and diagrams are
generated to reflect this model.

5. The model is evaluated to ensure that it meets all
user requirements and is modified as necessary by
repeating steps (1) through (4).

DATA DESIGNER utilizes three kinds of files: dictionary
files, subschema files, and generated design files. A
dictionary file ($DIC) contains a list of all data elements
that will be used in an application or subschema. This list
serves as a base for further development, e.g., additional
views. A subschema file ($SUB) contains data items and
relationships pertaining to particular views. Finally, the
generated design file ($DES) contains a logical data model
generated by DATA DESIGNER using the applicable dictionary
and subschema files as input. The generated design files,
in turn, serve as the input for the report and graphics
functions.

Key commands utilized during the creation of a logical
database design include the following:

CREATE--defines dictionary and subschema files.

BUILD--enters data item names into created files.

VALIDATE--compares the subschema files to the
dictionary file.

GENERATE--creates a logical DB design from the
validated files.

REPORT--prepares design documentation for the
logical design.

PLOT--uses the plotting subsystem to draw the
logical design.

EDIT--supports modification of existing files when
necessary.




In order to acquaint the reader with the operation of
DATA DESIGNER, we will demonstrate the dialog associated
with each step of the process necessary to create our Naval
Postgraduate School database example of Chapter II. The
user of DATA DESIGNER must first create the dictionary file
STUDENT.DIC and the subschema file STUDENT.SUB (user inputs
are the lines that begin with a prompt such as ">" or "B>")
as follows:


>CREATE STUDENT.DIC DICTIONARY
DDFC0101I File "STUDENT.DIC" of type "$DIC" created.

>CREATE STUDENT.SUB SUBSCHEMA
DDFC0203I File "STUDENT.SUB" of type "$SUB" created.


Next, the BUILD command is used to load data items into the
two created files. First, all possible data items are listed
in the dictionary file:


>BUILD STUDENT.DIC
DDBS0065I The file type is $DIC.
DDBS0018I There are no records in the file.
B>NAME
B>SSN
B>SERVICE
B>RANK
B>DONE
DDBS0064I File building is done.
DDBS0068I 4 records were entered
DDRN0098I Line 1100 is now the last line in your file.


The subschema file will support creation of one or more user
views. In our example, the subschema file contains two
views, the basic, overall view and the view intended for
Army use only. Notice that after the user enters the BUILD
process, each line must start with a modeling code. These
codes are used to identify components and to establish
relationships within the views. When building the subschema
files, all desired relationships must be specifically
stated. DATA DESIGNER uses "1" to specify a one-to-one
relationship and "M" for a one-to-many relationship. A
complete list of the modeling codes used in this example
appears in Table 2.


>BUILD STUDENT.SUB
DDBS0075I The file type is $SUB.
DDBS0081I There are no records in the file.
******************************************
*  THIS VIEW SUPPORTS THE OVERALL VIEW   *
******************************************
B>F,0100
B>T,0003
B>K,SSN
B>1,NAME
B>1,SERVICE
B>1,RANK
******************************************
*  THIS VIEW SUPPORTS THE ARMY VERSION   *
******************************************
B>F,0125
B>T,0002
B>K,SSN
B>1,NAME
B>1,RANK
******************************************
B>DONE
DDBS0064I File building is done.
DDBS0068I 13 records were entered


TABLE 2

DATA DESIGNER Modeling Codes

Code    Modeling Use

        Name a user view
F       Specify frequency of use
T       Specify req'd response time
K       Name a key
        Concatenate keys and data
        Concatenate keys in short way
        Label a data group
M       Identify a multiple association
1       Identify a single association
        Name an association
        Insert comments


Once the dictionary and subschema files are formatted,
the VALIDATE command is used to ensure that all entries and
relationships in the subschema files are valid based on the
information previously specified in the dictionary file.
DATA DESIGNER will respond with the number of views
processed, the number of lines read, and the number of
validation errors, if any, that were located:

>VALIDATE STUDENT.SUB STUDENT.DIC
DDVS0013I Validation begins.
DDVS0024I 2 Views were processed.
DDVS0025I 13 lines were read.
DDVS0015I 0 validation errors were detected.
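
The name check that VALIDATE is described as performing can be sketched in Python. This is a minimal sketch under our own assumptions, not DATA DESIGNER's implementation: the function validate_views is hypothetical, the views are represented as plain Python lists rather than in the product's file format, the view labels OVERALL and ARMY are our own, and the syntax check of each view is not modeled.

```python
def validate_views(dictionary_items, views):
    """Check that every data item named in a view is in the dictionary.

    Returns (views_processed, errors), where each error is a
    (view_name, item_name) pair for an item missing from the dictionary.
    """
    known = set(dictionary_items)
    errors = []
    for view_name, items in views.items():
        for item in items:
            if item not in known:
                errors.append((view_name, item))
    return len(views), errors


# Data items and views taken from the student example above.
dictionary_items = ["NAME", "SSN", "SERVICE", "RANK"]
views = {
    "OVERALL": ["SSN", "NAME", "SERVICE", "RANK"],
    "ARMY": ["SSN", "NAME", "RANK"],
}
processed, errors = validate_views(dictionary_items, views)
```

With the example data, both views are processed and no errors are reported, which matches the counts shown in the dialog.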

Once the files are successfully validated, the user will
utilize the subschemas to generate a logical database design
for his or her application.

The ten GENERATE options from which the user can choose
are powerful features that allow the user to control the way
that DATA DESIGNER produces a design and that support requests
for varying degrees of information during the generation
process. If the GENERATE command is called without options,
DATA DESIGNER will create a design that removes all redundant
data elements, generates intersection data groups as
necessary to resolve many-to-many relationships, suppresses
repeating data elements within data groups, generates single
key data groups from concatenated keys, and considers all
frequency and timing information that was contained in the
subschema files. In all cases, the end product of the
GENERATE command will be creation of a $DES file, in this
case, STUDENT.DES. The factory user's guide recommends that
options 4, 5, 6, 9, and 10 be used when generating the
initial design or after major revisions to the input files.
A brief description of each generate option is shown in
Table 3. Continuing with our student database example, the
user's dialog will be

>GENERATE OPTION 4 5 6 9 10 STUDENT.DESIGN
DDGS0032I Design generation begins.
DDGS0058I The subschema file is STUDENT.SUB
DDGS0214I Option 4 ignores undefined links.
DDGS0281I Option 5 generates foreign key information.
DDGS0301I Option 6 generates candidate key information.
DDGS0307I Option 9 generates cross-reference info.
DDGS0307I Option 10 ignores frequency and timing info.
DDGS0054I Design generation has finished.
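
One part of what GENERATE is described as doing, combining user views and removing redundant data elements, can be illustrated with a deliberately simplified Python sketch. It is not DATA DESIGNER's synthesis algorithm, which is not documented here; the function synthesize is hypothetical and handles only views that share a single key, ignoring intersection groups, concatenated keys, and frequency and timing information.

```python
def synthesize(views):
    """Merge views that share a key into one data group per key.

    `views` is a list of (key, attributes) pairs. Attributes that occur
    in more than one view are kept only once, in first-seen order.
    """
    groups = {}
    for key, attributes in views:
        group = groups.setdefault(key, [])
        for attribute in attributes:
            if attribute not in group:   # drop redundant occurrences
                group.append(attribute)
    return groups


# The two views of the student example, both keyed on SSN.
design = synthesize([
    ("SSN", ["NAME", "SERVICE", "RANK"]),   # overall view
    ("SSN", ["NAME", "RANK"]),              # Army view
])
```

Under these assumptions the two student views collapse into a single data group keyed on SSN, since every item in the Army view already appears in the overall view.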


TABLE 3

DATA DESIGNER Generate Options

Option    Purpose

  1       Generate unspecified associations.
  2       Suppress resolving redundant data.
  3       Suppress creating intersection files.
  4       Suppress generating inverse links.
  5       Generate foreign key information.
  6       Generate candidate key information.
  7       Allow repeating data items in groups.
  8       Suppress generating single key groups.
  9       Generate cross-reference information.
 10       Suppress frequency/timing information.


At this point, the logical database design is completed.
When using the options specified in the example, a series of
reports will be automatically generated. A list of reports
created is contained in Table 4.


TABLE 4

Reports Available with DATA DESIGNER

Report Type

Data Group Links Report
Canonical Schema Report
Data Group Index Report
Multiple Occurrences of Data Items
Data Relation Report
Data Group Candidate Keys Report
Data Item to User View Cross-Reference
User View to Data Group Cross-Reference
Data Group to User View Cross-Reference


To print these reports, the user's dialog will simply be

>REPORT 123456789 PRINTER FROM STUDENT.DESIGN
DDP20073I The reports were printed.

As a final aid in evaluation of the logical database
design, DATA DESIGNER is capable of producing diagrams of
(1) an overview of the logical database design and/or (2) a
hierarchical representation of that logical design. To
produce the logical overview diagram, the following dialog
is required:

>PLOT
DDPT0289I DATA DESIGNER Print Plot Release 2.5A
P>SET TYPE OVERVIEW
P>SET TITLE LOGICAL-DESIGN
P>DRAW FROM STUDENT.DESIGN
DDFS0310I Design STUDENT.DESIGN's description loaded.
DDNX0271I The overview plot generation is done.
P>RETURN
P>END

After using the printed reports and diagrams to evaluate the
database design, the user will, if satisfied, transcribe the
design into a specific DBMS format, such as ADABAS, or use
DATA DESIGNER's EDIT capabilities to revise the design as
necessary.

As discussed in Chapter IV, data dictionaries can be
evaluated on the basis of their accomplishment of security,
integrity, and documentation/maintenance. DATA DESIGNER, as
a free-standing data dictionary that can be used in
conjunction with a variety of DBMS and non-DBMS systems, does not
address the security aspect. It was apparently designed
with the assumption that the parent system with which DATA
DESIGNER interacts will handle access control and other
security-related functions.

DATA DESIGNER does, however, receive high marks for
maintaining data integrity and for the quality of its
documentation. Because it is designed to support the
development of logical database designs, it utilizes its dictionary
files to ensure that duplication of data is prevented
through generation of cross-reference files. When a
subschema is modified, DATA DESIGNER again utilizes its
dictionary files in the subsequent design generations. The
PLOT and REPORT functions provide a wealth of information
about the design, its components, and all users of the
subschemas. Relationships, both those included by the user
and those produced by DATA DESIGNER, can be seen in written
reports and visual representations. When modifications and
new designs are produced, the reports are automatically
updated to reflect all changes.

B.  MSP  DATAMANAGER 

DATAMANAGER, developed by MSP, Inc. of Lexington,
Massachusetts, is one member of the MANAGER family of
dictionary-oriented software products. Other products
include DESIGNMANAGER, PROJECTMANAGER, SOURCEMANAGER, and
TESTMANAGER. The entire line of products, while capable of
batch operations, is designed specifically to support
interactive operations with IBM 360/370/30xx/4300 series (and
plug compatible) computers. While DATAMANAGER is designed
as a nucleus for further expansion or specialization, it
provides all basic capabilities necessary to create and
maintain user dictionaries. Additional capabilities, available
as a series of extra-cost, add-on modules, include:

1.  interfaces  to  IDMS,  ADABAS,  IMS,  TOTAL,  SYSTEM  2000, 
and  other  DBMS 

2.  teleprocessing  interfaces 

3. generation of COBOL, PL/I, or other source language
data descriptions

4.  generation  of  DATAMANAGER  data  definitions  from 
existing  COBOL  or  PL/I  source  code 

5.  interfacing  of  a  DATAMANAGER  dictionary  to  user- 
written  programs 

6.  status,  audit,  and  security  facilities 

7.  extensibility  through  a  user-defined  syntax  facility 


DATAMANAGER can provide data dictionary capabilities to
users utilizing a variety of hardware/software combinations.
By providing interface modules for several popular database
management systems, DATAMANAGER is obviously more flexible
than a dictionary that is tied to a single, distinct database
system. However, DATAMANAGER's flexibility extends beyond the
obvious:


DATAMANAGER is intended for use in any organization in
which there is a computerized data processing function.
Its use, however, is not confined to those elements of
data that are held in computer files or that are acted
upon by computerized systems. Definitions of all data
held and used by an organization, in its manual systems
as well as its computerized ones, can be held in a
DATAMANAGER data dictionary. DATAMANAGER is designed to
be used both with traditional files, powerful database
systems, and in a mixed environment. Use of the data
dictionary remains independent of the database management
system, although further add-on facilities enable
DATAMANAGER data definitions to be generated directly
from the database data description language source
coding. [Ref. 21]


The  architecture,  or  structure,  of  the  DATAMANAGER  data 
dictionary  is  composed  of  four  (or  five)  data  files,  called 
data  sets  in  the  user  documentation. 

The source data set contains the data definitions as
originally input into the system by the user. When the user
modifies or appends changes, the data definitions are
automatically updated within the file.

The data entries data set contains all encoded data
definitions generated by DATAMANAGER after evaluating the
contents of the source data set. Data definitions are
encoded to reduce the time required for DATAMANAGER to
process the information within the data dictionary. During
this encoding process, relationships, aliases, and
classifications are also identified.

The index data set is an automated index containing the
name and address of each entity definition that is in the
source data or data entries data sets. The index data set
serves as a data directory to support the fastest possible
retrieval of entity definitions and associated data.

The error recovery data set is used by the system as a
temporary backup storage file. This capability was
implemented to increase reliability by providing for automatic
recovery of the dictionary contents in the case of external
interruption or other system failure during a dictionary
update.

The log data set is an optional capability that is
highly recommended by MSP. All updating commands, associated
data definitions, and amendments are logged into that
file as they occur. Entries include command identification,
full date, time, user, and status of all physical input/
output accesses. Additionally, the data administrator has
the option of specifying that all commands directed to the
data dictionary be logged. When combined with other system
backup facilities, this allows DATAMANAGER to be "rolled"
forward from the last backup point in case full recovery is
ever required.
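
The roll-forward idea can be sketched in Python as follows. This is an illustration under our own assumptions and does not describe DATAMANAGER's log format or recovery procedure: the function roll_forward is hypothetical, each log entry is reduced to a (command, name, definition) triple, and the meanings given to ADD, REPLACE, and REMOVE are assumed for the example.

```python
def roll_forward(backup, log):
    """Rebuild dictionary contents from a backup plus logged updates.

    `backup` maps entry names to definitions as of the last backup.
    `log` lists the updating commands applied since then, oldest first.
    """
    contents = dict(backup)   # work on a copy; the backup stays intact
    for command, name, definition in log:
        if command in ("ADD", "REPLACE"):
            contents[name] = definition
        elif command == "REMOVE":
            contents.pop(name, None)
        else:
            raise ValueError(f"unknown logged command: {command}")
    return contents


# Hypothetical backup and log for a three-update recovery.
backup = {"SSN": "9 digits", "NAME": "30 characters"}
log = [
    ("ADD", "RANK", "5 characters"),
    ("REPLACE", "NAME", "40 characters"),
    ("REMOVE", "SSN", None),
]
recovered = roll_forward(backup, log)
```

Replaying the log in its original order is what makes the recovered contents match the state at the moment of failure; this is why the log records every updating command as it occurs.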

DATAMANAGER is a powerful system that utilizes a series
of interactive commands to create, maintain, and document
data dictionary contents. These standard commands are
listed in Table 5. DATAMANAGER provides a predefined series
of standard entity-types, relationship-types, and
attribute-types that form the system standard schema. These are
listed in Table 6. As shown in Table 6, DATAMANAGER uses
only six entity-types in the standard schema. Those
elements exist within the system as members of a logical
hierarchy as shown in Figure 5.1. Discussion in the user
documentation reveals that DATAMANAGER strives to provide
the capability to maintain all system data while maintaining
ease and simplicity of logical design.


TABLE 5

DATAMANAGER Standard Commands

ADD            ENDDMR         REMOVE
ALSO           FORMAT         RENAME
ALTER          GLOSSARY       REPLACE
AUTHORITY      INSERT         REPORT
BULK           KEEP           SHOW
COPY           LIST           STATUS
DICTIONARY     MODIFY         WHAT
DOES           PERFORM        WHICH
DROP           PRINT          WHO
ENCODE         PROTECT        WHOSE

TABLE 6

DATAMANAGER Standard Schema Descriptors

PROCESS ENTITY-TYPES
    MODULE
    PROGRAM
    SYSTEM

DATA ENTITY-TYPES
    FILE
    GROUP
    ITEM

RELATIONSHIP-TYPES
    SEE

ATTRIBUTE-TYPES
    ACCESS-AUTHORITY
    ADMINISTRATIVE-DATA
    ALIAS
    CATALOGUE
    COMMENT
    DESCRIPTION
    EFFECTIVE-DATE
    FREQUENCY
    NOTE
    OBSOLETE-DATE
    QUERY
    SECURITY-CLASS

A complete specification of the data resource of an
organisation requires the definition of the characteristics
and of the interrelationships of data, and of the
contexts in which the data is used. Accordingly, the
design of DATAMANAGER provides for a hierarchy of member
types, within which it is possible to describe all
elements and assemblages of data and the processes that
act on the data. The number of member types defined for
the basic hierarchy has been kept as small as possible
while meeting these requirements. [Ref. 22]


SYSTEM
PROGRAM
MODULE
DATABASE
FILE
GROUP
ITEM

Figure 5.1 DATAMANAGER's Hierarchy of Entity-types


At the lowest level, an ITEM is a fundamental element of
data, the smallest unit within DATAMANAGER. A GROUP is a
collection of items or other groups. The third entity-type,
the FILE, can either be implemented as a traditional file
organization (a collection of data groups, independent of a
DBMS), or as the equivalent association of data within a
database. If DATAMANAGER is used with a database, another
entity-type, DATABASE, will be provided with the database
interface module, e.g., ADABAS, that is selected. The new
member, in this case, ADABAS-DATABASE, will either replace
the FILE element within the hierarchy, or coexist by
residing between the FILE and MODULE elements. A MODULE is
a collection of data that includes descriptions of a
database (if used), FILES, GROUPS, and/or ITEMS. The module is
the lowest unit that can directly or indirectly manipulate
data, and is a subdivision of a PROGRAM. The PROGRAM is
defined in terms of collections of modules and those
processes that input or output data to/from the system. A
program is executable. A SYSTEM is the highest element of
the DATAMANAGER hierarchy and contains all subordinate data
declarations.
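
The containment rules of this hierarchy can be illustrated with a short sketch. It is not DATAMANAGER syntax; the class name Member, its fields, and the rule that a member may only hold members of a lower entity-type are our own simplifying assumptions, and the optional DATABASE level is left out.

```python
from dataclasses import dataclass, field

# Standard entity-types, lowest to highest, as described in the text.
# DATABASE is omitted because it only appears with a DBMS interface module.
LEVELS = ["ITEM", "GROUP", "FILE", "MODULE", "PROGRAM", "SYSTEM"]


@dataclass
class Member:
    """One dictionary member; class and field names are illustrative only."""
    name: str
    entity_type: str
    children: list = field(default_factory=list)

    def add(self, child: "Member") -> "Member":
        parent_rank = LEVELS.index(self.entity_type)
        child_rank = LEVELS.index(child.entity_type)
        # A GROUP may hold ITEMs or other GROUPs; every other member is
        # assumed to hold only members of a strictly lower entity-type.
        group_in_group = self.entity_type == child.entity_type == "GROUP"
        if child_rank >= parent_rank and not group_in_group:
            raise ValueError(
                f"{child.entity_type} cannot be placed under {self.entity_type}"
            )
        self.children.append(child)
        return child


# Hypothetical member names, used only to exercise the rules.
record = Member("STUDENT-RECORD", "GROUP")
record.add(Member("STUDENT-SSN", "ITEM"))
student_file = Member("STUDENT", "FILE")
student_file.add(record)
```

An attempt to place a FILE beneath an ITEM raises an error in this sketch, which mirrors the idea that an ITEM is the smallest unit in the dictionary.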

While DATAMANAGER stresses simplicity in the logical
design of the system standard schema, it can be configured
to be highly extensible. An add-on module, the User Defined
Syntax Facility (UDSF), is required to support user
declaration of schema descriptors. If present, this facility
provides several unique capabilities. First, in addition to
allowing the user to define his or her own entity-types, the
module allows the data administrator to insert one (or more)
of three standard sets of extended entity-types. These sets
are:

1.  The  Extended  Data  Processing  Structure  (EDPS)  which 
provides  additional  entity-types  frequently  used 
within  the  data  processing  environment.  These 
include  PROCEDURE,  SUBROUTINE,  and  DATASET. 

2.  The Structured Analysis Structure (SAS) which
provides entity-types frequently used when
conducting structured design. These include
SUBPROCESS and DATASTRUCTURE.

3.  The Structured Development Structure (SDS) which
strives to provide all entity-types necessary to
satisfy the requirements of the majority of potential
users. This collection of entity-types includes
all those found in the EDPS and SAS subsets.

Second, the UDSF module supports user definition of
attribute-types related to both system standard and
user-created entity-types. Three distinct categories of
attribute-types are recognized within DATAMANAGER. These
are:

1.  Global (common) attribute-types which will apply to
all entity-types within the structure, e.g.,
SECURITY-CODE.

2.  Generic attribute-types which can be added to those
of a specific standard entity-type, for example,
FILE. Whenever a user defined entity-type is
created that uses the standard entity-type's format
as a base, the generic attribute-types of the standard
entity-type will be passed into the new entity-type.

3.  Specific attribute-types which allow the designer to
tailor an entity-type to satisfy the particular
requirements of that organization.

Finally, the UDSF module supports user definition of
relationship-types in both forward and backward directions.
This enables DATAMANAGER to support the three (or four)
relationship mappings we have previously described.

Once DATAMANAGER is installed on the computer, two major
steps must be conducted before information can be entered in
the data dictionary. First, an empty data dictionary must
be defined using the Controller commands DICTIONARY and
AUTHORITY, which are restricted to use by the data
administrator. Briefly, the dictionary must be created and opened,
authority levels must be defined, and potential users must
be identified. As the second major implementation step,
member entity-types, both standard and user-created, must
be defined. Every session with DATAMANAGER is conducted as
a "run", in which a series of system commands, specified by
the user, are carried out. Every session must begin with
the commands DICTIONARY and AUTHORITY. After review of the
user documentation, this process will probably seem
difficult and confusing to most users, even to those who have
worked with other data dictionaries. DATAMANAGER is,
however, an impressive, powerful package in the hands of an
experienced user. Our sample database, STUDENT, would be
entered as a FILE (or DATABASE, if implemented). The format
for an individual student's record becomes a GROUP, in which
each data element, e.g., service, SSN, etc., becomes an
ITEM. The structure of our example, after implementation in
DATAMANAGER, would appear as shown in Figure 5.2.
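
The rule that every run opens with DICTIONARY and AUTHORITY can be expressed as a simple check. The sketch below is ours, not part of DATAMANAGER; the function name validate_run and the operands shown after each command keyword are hypothetical.

```python
def validate_run(commands):
    """Return True if a run opens with DICTIONARY and then AUTHORITY.

    Only the leading keyword of each command line is inspected; blank
    lines are ignored. This is an illustrative check, not vendor logic.
    """
    keywords = [line.split()[0].upper() for line in commands if line.strip()]
    return keywords[:2] == ["DICTIONARY", "AUTHORITY"]


# Hypothetical runs; the operands are placeholders, not real syntax.
good_run = ["DICTIONARY STUDENT-DD", "AUTHORITY XXXX", "LIST"]
bad_run = ["LIST", "DICTIONARY STUDENT-DD", "AUTHORITY XXXX"]
```

A run such as bad_run, which issues another command before the two opening commands, would be refused under this reading of the rule.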

DATAMANAGER aggressively supports each of the three
objectives of data dictionary usage: data integrity,
security, and maintenance/documentation. It enforces data
integrity through its hierarchical structure of
entity-types, predefined standard schema relationships,
identification of aliases, and automatic update procedures. System
definitions and error-checking are used to validate the
structural "correctness" of each entity, relationship, and
attribute as it is created or defined. Once the FILE or
DATABASE is defined, DATAMANAGER monitors input of data into
system structures by comparing the input to the appropriate
ITEM's characteristics. Each of the MSP products, including
the DBMS interfaces, displays evidence that MSP recognizes
the importance of data integrity as a vital link to
efficient and dependable control of data.


00205 PRODUCE STUDENT LAYOUTS,
00206 PRINT GIVING DESCRIPTIONS

*****************************************************************
* DESCRIPTION OF STUDENT                                        *
*****************************************************************
* LEVEL  NAME           LEN  TYPE   REMARKS                     *
*****************************************************************
*   1    STUDENT        069  GROUP  STUDENT                     *
*   2    STUDENT-NAME   050  CHAR   50 DIG/ALPH-NUM             *
*   2    STUDENT-SSN    011  CHAR   3 NUM,"/",2 NUM,"/",4 NUM   *
*   2    STUDENT-SERV   005  CHAR   05 DIG/ALPH-NUM             *
*   2    STUDENT-RANK   003  CHAR   1 CHAR,"-",1 NUM            *
*****************************************************************

Figure 5.2 STUDENT example in DATAMANAGER Structure

The DATAMANAGER nucleus provides security by inclusion
of one type of security mechanism, password control. The
Controller, or dictionary administrator, must assign a
unique password to each authorized user. Each user and
password combination must be registered within the
dictionary. DATAMANAGER will reject any command session
which does not commence with an AUTHORITY command followed
by an authorized password.

Several additional security mechanisms can be provided
by including the Audit and Security Facility module in the
system implementation. First, the Controller gains the
capability of registering general and specific security
levels within the dictionary. Each user may be assigned a
general security level in addition to the unique password
previously assigned. Within the system, the Controller will
assign a specific Insertion Security Level and a specific
Protection Security Level. A user whose general level is
lower than the specific insertion level is not allowed to
insert, modify, or delete information within the data
dictionary. This provides the capability to assign "read
only" access. A user whose general level is lower than the
specific protection level, or one who does not have a
general security level assigned, is not allowed to establish
protection for system members, or data structures.
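
The two threshold comparisons just described can be sketched as follows. The class and method names are our own, the numeric levels are arbitrary, and the treatment of a user with no general level as read-only is an assumption; the text states that case explicitly only for the protection level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DictionarySecurity:
    """Dictionary-wide thresholds set by the Controller (names are ours)."""
    insertion_level: int
    protection_level: int

    def may_update(self, general_level: Optional[int]) -> bool:
        # Insert, modify, and delete need a general level at or above
        # the insertion level. Assumption: no assigned level means
        # read-only access.
        return general_level is not None and general_level >= self.insertion_level

    def may_protect(self, general_level: Optional[int]) -> bool:
        # Per the text, a user with no general level, or one below the
        # protection level, may not establish protection.
        return general_level is not None and general_level >= self.protection_level


# Arbitrary example thresholds.
security = DictionarySecurity(insertion_level=5, protection_level=8)
```

With these thresholds, a user at general level 6 could modify dictionary contents but could not establish protection for members.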

Second, users who do have a general security level equal
to or higher than the specific protection level may use the
PROTECT command to assign protection to specific members in
the form of ACCESS, ALTER, and REMOVE security levels. This
capability allows key users to control, or even prohibit,
access to those structures that they own. Any member which
is not owned but does require security can be assigned the
same three control levels by the dictionary administrator.

Finally, the Audit module provides the capability to
produce over 500 different audit reports, using information
contained within DATAMANAGER. The majority of these reports
are reserved for use of the dictionary administrator alone.
This includes the capability of logging all commands issued
to the system. This "trace" mechanism increases security by
providing a record of all entries, or attempted entries, to
the system.

The last significant objective of a data dictionary must
be to support maintenance and documentation of the
information contained within the information system. DATAMANAGER
provides a set of commands unique to the maintenance
function. A listing of these is shown as Table 7. Maintenance
can be supported during both interactive and batch sessions.
A series of query and report commands is provided with the
nucleus module to support usage studies, maintenance, and
documentation. These commands are listed in Table 8. The
REPORT, PRINT, and GLOSSARY commands provide a great deal of
information to the dictionary administrator and other
designated users. When system data is modified, the query
and report commands can be used to provide updated
documentation and records.


TABLE 7

DATAMANAGER Maintenance Commands

INSERT         MODIFY         ENCODE
REPLACE        BULK ENCODE    COPY
ADD            RENAME         ALTER
REMOVE         KEEP           DROP
ALSO KEEP      PERFORM


TABLE 8

DATAMANAGER Report/Query Commands

Report Commands

PRINT          BULK REPORT    LIST
SWITCH         REPORT         SKIP
GLOSSARY       TEXT           SPACE
BULK PRINT

Query Commands

WHAT           WHO            WHICH
WHOSE          DOES           SHOW

One additional DATAMANAGER capability warrants mention
with respect to maintenance and documentation. One system
entity-type which has not been discussed and does not reside
in the hierarchy shown earlier is the COMMAND-STREAM
entity-type. This structure is a unique feature of DATAMANAGER
that allows previously stored series of commands to be
executed by using the PERFORM command. The use of specific
COMMAND-STREAMs can be compared to the subroutines of a
general programming language. While the COMMAND-STREAM can
be used in many ways within DATAMANAGER, it becomes
especially useful during generation of reports and documentation
during maintenance sessions. A "subroutine" can be
specified that will produce all standard reports; when system
information is updated, the applicable reports are produced
by one simple PERFORM command at the end of the maintenance
session.
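
The subroutine analogy can be made concrete with a small sketch. This is a model of the idea only: the class MiniDictionary, the stream name STD-REPORTS, and the operands of the stored commands are hypothetical, and real DATAMANAGER command syntax is not reproduced here.

```python
class MiniDictionary:
    """Toy command processor that models stored command streams."""

    def __init__(self):
        self.command_streams = {}   # stream name -> list of stored commands
        self.executed = []          # commands actually carried out, in order

    def store_stream(self, name, commands):
        self.command_streams[name] = list(commands)

    def execute(self, command):
        words = command.split()
        if len(words) >= 2 and words[0].upper() == "PERFORM":
            # Replay each stored command, much like calling a subroutine.
            for stored in self.command_streams[words[1]]:
                self.execute(stored)
        else:
            self.executed.append(command)


dictionary = MiniDictionary()
dictionary.store_stream("STD-REPORTS", ["REPORT STUDENT", "GLOSSARY STUDENT"])
dictionary.execute("MODIFY STUDENT-RANK")
dictionary.execute("PERFORM STD-REPORTS")
```

After the maintenance command, the single PERFORM replays both stored report commands, which is the usage pattern the paragraph above describes.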


C.  ADR DATADICTIONARY


DATADICTIONARY is one of fourteen separate, but highly
integrated, software products produced by Applied Data
Research, Inc. (ADR). Initially introduced in 1978, the
integrated system, Relational Information Management
Environment (RIME), is considered to be one of the first
true examples of the fourth generation of systems software.
An article in Infosystems states


Three conditions are certain in the 1980s.... First,
applications packages will not meet the need for most
applications that will be computerized. Second, systems
software products that improved productivity, reduced
application costs and increased accessibility to
information in the 1970s will be even more valuable in the
1980s. And third, existing applications will not be
readily rewritten or replaced and will have to be
maintained for many years.... The success or failure of many
organizations in the 1980s will depend on how
effectively they improve and integrate data processing in
their operations. This is particularly critical for
organizations that have been traditional data processing
users over the last 20 years and have worked with second
and third generation mainframe hardware and software
systems. [Ref. 23]


Prior  to  analyzing  ADR's  data  dictionary,  it  is  important  to 
review  briefly  the  objectives  of  fourth  generation  software 
and  integrated  systems  and  to  provide  an  overview  of  RIME. 


Each of the "generations" of system software can be
identified by one or more significant advancements. The
first generation provided primarily assembly language
programs. The second generation's gifts centered around
development of high-level languages and improved operating
systems. Numerous advances surfaced during the third
generation, e.g., database management systems, data dictionaries,
structured programming techniques, early efforts at decision
support systems, and program generators. During the fourth
generation, it is anticipated that advances will occur in
three primary areas: very high-level languages, relational
database management systems, and the automated office or
integrated information center. In the latter, all automated
functions, including data processing, word processing,
database and file management, decision support, program
development and maintenance, and communications, will be combined
into one "total" system. This could, in theory, be
accomplished by one giant program, or, in the case of ADR and
other vendors, as a series of smaller, integrated packages.

During 1982, the U.S. Army awarded a contract for the
largest, most complex information processing project ever
funded by the government. Named VIABLE (Vertical
Installation Automation Baseline), the project will provide
a nationwide automated network that will connect forty-seven
military bases to massive computer power at five regional
data processing centers. The network has been designed to
support the management of information in peacetime and in
times of war and other national emergencies. During the
planning period, interest centered on three principal
functional areas: communication, interactive program
development, and database management. The primary contractor,
Electronic Data Systems, selected 11 of ADR's products for
use as the base of the VIABLE system. A complete list of
ADR/RIME elements is included as Table 9.


TABLE 9

Components of ADR's DATACOM System

Component            Function

DATACOM/DB           Relational Database System
DATADICTIONARY       Resource Control System
DATAQUERY            English-like Query Language
DATAREPORTER         Info. Retrieval/Reporting
DATAENTRY            On-line Data Entry System
COBOL/DL             Extended Language/Utilities
LIBRARIAN            Program Management System
ROSCOE               Program Maintenance System
LOOK                 Real-time Measurement System
METACOBOL            Language Pre-compiler
AUTOFLOW II          System Development Tool
* ADR/D-NET          Distributed Database Network
* ADR/EMAIL          Electronic Mail System
* ADR/IDEAL          Interactive Develop. System

* New ADR products which have not yet been
included in the Army's VIABLE project.


Some of these elements can be considered to be
high-priced extras or application-specialized options. If an
organization were to utilize all components, users would
have access to a complete database system with data
dictionary, a relational query language, report and graph
generators, extended COBOL compiler, program development
support, distributed local data network, electronic mail
system, and more.

According to ADR literature, the heart of the integrated
system is DATADICTIONARY. The company's database system,
DATACOM/DB, a true relational database³ system that utilizes
a patented flexible data structure, was designed especially
to interact with DATADICTIONARY. As an active dictionary
system, DATADICTIONARY is queried by all other components of
the system prior to access of system information. This
maximizes data integrity while minimizing data redundancy.
DATADICTIONARY offers a menu-driven user interface. It
provides security, supplies full documentation/maintenance
capabilities, and can be extended to interact with future
system products and to support future user requirements.
The documentation provided with DATADICTIONARY and other ADR
packages is almost overwhelming in its completeness. The
dictionary alone has fifteen separate volumes. While an
extremely capable system, DATADICTIONARY is not one that
will be easily or quickly mastered.

³A relational database is one in which the relationships
between data are implied by the values of the data. For
example, two records are related if they have the same
attribute, as STUDENT and PROFESSOR are related by the fact
that they are associated with a particular CLASS.

DATADICTIONARY provides 20 standard entity-types in its
system standard schema and supports user creation of
additional, more application-specific schema descriptors. For
most applications, the standard types listed in Table 10
will prove to be sufficient.


TABLE 10

ADR DATADICTIONARY Standard Entity-types

DATABASE     KEY          SYSTEM       REPORT
AREA         ELEMENT      PROGRAM      JOB
FILE         LIBRARY      MODULE       STEP
RECORD       MEMBER       DATAVIEW     AUTHORIZATION
FIELD        PANEL        PERSON       NODE


DATADICTIONARY maintains a
logical hierarchy among the principal standard entity-types,
as indicated in Figure 5.3. Many of the standard
entity-types are provided with primary relationships already
defined with key subordinate entity-types. For example, in
our STUDENT example, we will initially use the entity-type


DATABASE 


FIELD 


Figure  5.3  A  Logical  Hierarchy  of  Entity-types 


DATABASE to create our sample database. When we define the
database entity-occurrence, the DATABASE-AREA relationship
is automatically provided. Similarly, when the
area-occurrence is defined, the AREA-FILE relationship is
established by DATADICTIONARY. In the case of RECORD, creation
of an occurrence provides three relationship-types:
RECORD-FIELD, RECORD-KEY, and RECORD-ELEMENT. These three
relationships, at the lowest level of the logical hierarchy,
support actual entry of attribute-values, or data. Whether
system-defined or user-created, all relationship-types in
DATADICTIONARY have four attributes:


1.  Relationship-mapping - describes the number of
entity-occurrences which are the subjects and the
objects of this relationship, i.e., the type of the
relationship. DATADICTIONARY supports four types of
relationship mappings, i.e., one-to-one, one-to-many,
many-to-one, and many-to-many.

2.  Required-relationship - describes whether each
entity-occurrence in the named object entity-type is
to be related to at least one entity-occurrence of
the named subject entity-type.

3.  Automatic-relationship - describes whether each
entity-occurrence of the named object entity-type is
to be automatically related to an entity-occurrence
of the named subject entity-type when the object is
added.

4.  Ordered-relationship - describes whether the order of
relationships added in this relationship-type is
significant. An ordered-relationship allows
entity-occurrences to be retrieved and displayed in a
specific order.
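
The four attributes listed above can be pictured as a small record. The sketch below is illustrative only: the class names, field names, and the mapping codes other than "1M" are our own assumptions, and the helper method shows one possible reading of how the mapping constrains new relationships.

```python
from dataclasses import dataclass
from enum import Enum


class Mapping(Enum):
    # "1M" is the code shown on the vendor's relationship display panel;
    # the other three codes are placeholders of our own.
    ONE_TO_ONE = "11"
    ONE_TO_MANY = "1M"
    MANY_TO_ONE = "M1"
    MANY_TO_MANY = "MM"


@dataclass(frozen=True)
class RelationshipType:
    name: str
    subject_type: str
    object_type: str
    mapping: Mapping
    required: bool    # must each object relate to at least one subject?
    automatic: bool   # is the object related when it is added?
    ordered: bool     # is the order of added relationships significant?

    def allows_another_object(self, objects_already_related: int) -> bool:
        """Can one subject occurrence take a further object occurrence?"""
        if self.mapping in (Mapping.ONE_TO_MANY, Mapping.MANY_TO_MANY):
            return True
        return objects_already_related == 0


# A one-to-many, required, non-automatic, unordered relationship-type.
database_area = RelationshipType(
    name="DATABASE-AREA",
    subject_type="DATABASE",
    object_type="AREA",
    mapping=Mapping.ONE_TO_MANY,
    required=True,
    automatic=False,
    ordered=False,
)
```

Under a one-to-many mapping a single DATABASE occurrence may be related to any number of AREA occurrences, whereas a one-to-one mapping would refuse a second.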

If using the interactive version, DATADICTIONARY Online,
the user will be prompted by a series of panels, or menus.
The Master Menu is displayed in Figure 5.4. The master menu
supports creation, modification, and deletion of
entity-occurrences. Additionally, it provides access to all other
system menus through option (7). The following procedures
would be utilized to create the STUDENT example within
DATADICTIONARY. First, the Add Detail routine, option (2),
is selected. In answering the system prompts, the user
creates the new entity-occurrence, DATABASE.STUDENT, in the
following dialog:



=> _
DDOL: SELECTION CRITERIA FOR DETAIL ADD
LV TY ENTITY    RECORD DD OCCURRENCE NAME   VER STAT
00 E  DATABASE            STUDENT            001
CURRENT OCCURRENCE QUALIFIER:
DATABASE STUDENT (001) TEST
******************************************************
                     DETAIL ADD
ATTRIBUTE        VALUE
DESCRIPTION      NPS STUDENT DATABASE
CONTROLLER       DEPARTMENT OF REGISTRAR
AUTHOR           REGISTRAR
BASE-ID          001
BASE-TYPE        ADR/DB
DBMS-USED        RELATIONAL


MASTER MENU

ENTER THE REQUESTED OPTION ==>        THERE ARE 08 OPTIONS

1. DISPLAY MENU       MENU FOR DISPLAY FUNCTIONS
2. ADD DETAIL         ADD DETAIL ENTITY-OCCURRENCE
3. DELETE DETAIL      DELETE DETAIL ENTITY-OCCURRENCE
4. UPDATE DETAIL      UPDATE DETAIL ENTITY-OCCURRENCE
5. COPY               COPY/MODEL ENTITY-OCCURRENCE
6. STATUS CHANGE      CHANGE ENTITY-OCCURRENCE STATUS
7. SUPPORT MENU       ALIAS, DESCRIPTOR, RELATIONSHIP,
                      TEXT, AND QLIST
8. SECURITY           OCCURRENCE SECURITY MAINTENANCE


Figure 5.4 ADR DATADICTIONARY Master Menu


Each of the 20 standard entity-types will contain predefined
key attributes. Values for these attribute-types are
entered during the Add Detail routine. In the case of the
DATABASE entity-type, and as was shown above, the key
attributes are DESCRIPTION, CONTROLLER, AUTHOR, BASE-ID,
BASE-TYPE, and DBMS-USED.

In similar fashion, the user must create the subordinate
logical structures, AREA.STUDENT, FILE.STUDENT, and
RECORD.STUDENT. As each occurrence is created, it must be
related to the next highest entity-occurrence in the logical
hierarchy, e.g., FILE.STUDENT must be related to
AREA.STUDENT. For this process, the user invokes the
Relationship Definition Panel to define the relationships.
DATADICTIONARY will respond with the Relationship Definition
Display which presents the characteristics of each of the
relationships as it is enacted. Examples of these panels
are shown below:

=> _
DDOL: RELATIONSHIP DEFINITION
RELATIONSHIP NAME       $INTERNAL
SUBJECT ENTITY TYPE     DATABASE.STUDENT
OBJECT ENTITY TYPE      AREA.STUDENT


=> _
RELATIONSHIP DEFINITION DISPLAY
SELECTION:
$INTERNAL  DATABASE.STUDENT  AREA.STUDENT
NAME       SUBJ TYPE  OBJ TYPE  MAP  REQ  AUTO  ORDER
$INTERNAL  DATABASE   AREA      1M   Y    N     N

As a final step in installing the STUDENT database, QLIST
commands must be used to define specific fields, keys, and
elements within RECORD.STUDENT. This is the point where the
specific attributes of the STUDENT example, e.g., SSN, Name,
Service, and Rank, are entered into the database design.
The user defines attribute name, parent, class, type,
length, and number of repetitions. One example of this
process is as follows:

=> _
DDOL: SELECTION CRITERIA FOR RECORD QLIST MAINT
LV TY ENTITY  RECORD DD OCCURRENCE NAME   VER STAT
00 E  RECORD            STUDENT               TEST
CURRENT OCCURRENCE QUALIFIER:
RECORD STUDENT (001) TEST
******************************************************
RECORD QLIST MAINTENANCE
E FC FIELD NAME  PARENT NAME  INSERT AFT  C T LEN REP
  A  SERVICE     SSN          NAME        S C 004 001



Looking at the last line of the figure, the user has
indicated the following:

FC (function code) = Add a field
Field Name = SERVICE
Parent Name = SSN (in this case, this is the key field)
Insert After = NAME (the new field's value will follow NAME's)
C (Class) = Simple (as opposed to a compound field)
T (Type) = Character (vice a numeric or binary field)
LEN (Length of Field) = 4
REP (Number of repetitions) = 001 (vice a repeating field)

At this point, the schema of STUDENT has been entered into
DATADICTIONARY. The user may now use DATACOM/DB facilities
to enter attribute-values into the system. Upon completion,
the database administrator or authorized users can create as
many external views, or subschemas, as desired.

DATADICTIONARY receives high marks in the areas of data
integrity, security, and documentation/maintenance.
DATADICTIONARY's logical hierarchy of structures and
systematic installation procedures tend to enforce data integrity.
The dictionary's extension routines and view generation
processes have been written to ensure that data integrity is
maintained throughout expansion or specialization of the
database. To enforce security, DATADICTIONARY provides
multiple layers of protection. Two separate and independent
mechanisms are provided in all implementations. These are
(1) use of entity passwords, and (2) inclusion of locks and
override codes. If the installation is the Online version,
a third mechanism, user validation, is available. As each
entity is created, or at any time afterwards, a four-digit
password can be assigned to that entity. Passwords can be
either unique or assigned to a series of related entities.
Any user attempting to modify or access a password-protected
entity-occurrence will be queried to provide the applicable
password prior to gaining access. The second layer of
protection centers on use of LOCK and OVERRIDE codes.
Unlike passwords, which either allow or prohibit access,
lock codes can be utilized to limit the degree of access
granted. Three levels of security are provided:

LOCK0   No restrictions exist on an entity
        (default setting).

LOCK1   The entity cannot be updated or deleted
        without an override code. The entity
        can be copied, displayed, or printed
        without restrictions.

LOCK2   No action will be permitted unless the
        override code is given to the system.

The actual override codes will be used dictionary-wide, that
is, a single code will exist to satisfy LOCK1 conditions
while another code exists to access entities protected by
LOCK2. Finally, if using DATADICTIONARY Online, the highest
layer of security becomes user validation. The name of each
user of the system is defined as a PERSON entity-type. Each
entity-occurrence will include a unique password which must
be provided to enter the system through the online
interface. Four levels of authorization are supported by
DATADICTIONARY:

_DIS    The user is allowed to display all data in
        the dictionary.

_UPD    The user is allowed to update the dictionary.

_COP    The user is allowed to copy an entity.

_ADM    The user is allowed the use of all commands
        and is allowed to process all panels.

Authorization at one level will automatically provide all
lower authorizations.

ADR's multiple-layered approach to security provides a
system that is both highly flexible and very secure. The
database administrator will be able to provide whatever
degree of access is required to each individual user as
well as to each group of users within the system. If one
layer of security is broken, access will be prevented by the
other security mechanisms.


Invocation of any function thus authorized on any entity
is still subject to the password and lock provision
discussed earlier in this section. Thus, a user with
_UPD authorization cannot modify an entity that is
password protected unless the required password is
supplied. [Ref. 24]
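
The way the three layers combine can be sketched as a single check. This is our own reading, not ADR code: the ranking of the four authorizations in listed order, the table mapping each action to a minimum authorization, and all names and code values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the four authorizations rank in the order the text lists them.
AUTH_RANK = {"DIS": 0, "UPD": 1, "COP": 2, "ADM": 3}
# Hypothetical minimum authorization needed for each action.
REQUIRED_AUTH = {
    "display": "DIS", "print": "DIS",
    "update": "UPD", "delete": "UPD",
    "copy": "COP",
}
# Actions that LOCK1 leaves unrestricted, per the text.
LOCK1_FREE = {"copy", "display", "print"}


@dataclass
class Entity:
    name: str
    password: Optional[str] = None   # four-digit entity password, if any
    lock: int = 0                    # 0, 1, or 2


def may_perform(user_auth, action, entity, password=None,
                override=None, override_codes=None):
    """Apply user validation, then entity password, then lock level."""
    override_codes = override_codes or {}
    # Layer 1: user authorization (Online version only).
    if AUTH_RANK[user_auth] < AUTH_RANK[REQUIRED_AUTH[action]]:
        return False
    # Layer 2: entity password, when one has been assigned.
    if entity.password is not None and password != entity.password:
        return False
    # Layer 3: lock level with dictionary-wide override codes.
    if entity.lock == 0:
        return True
    if entity.lock == 1 and action in LOCK1_FREE:
        return True
    return override is not None and override == override_codes.get(entity.lock)


# Placeholder password and override codes.
codes = {1: "OVR-ONE", 2: "OVR-TWO"}
record = Entity("RECORD.STUDENT", password="1234", lock=1)
```

In this sketch a user with update authorization still fails to modify record without both the entity password and the LOCK1 override code, which matches the layered behavior the quotation describes.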


DATADICTIONARY provides extensive capabilities to support
maintenance and documentation of the data dictionary. It
can be maintained by using either the Online maintenance
facility or available batch commands. If using the online
facility, a series of screen panels will again guide the
user through the desired maintenance activity. This
facility greatly simplifies individual changes; however,
major changes affecting numerous entities would be initiated
most easily through batch commands. In either case,
maintenance centers around four principal functions:

1.  adding, copying, updating, or deleting system
    entities

2.  search for, identification of, and creation of
    entity aliases

3.  maintenance of descriptors and schema descriptors

4.  maintenance of descriptive texts associated with
    system entities

Similarly, DATADICTIONARY provides numerous report
generation capabilities, most of which can be initiated
through either batch or Online Maintenance sessions.
Principal report types are shown in Table 11. Generated
reports will support both the initial generation of user
databases and subsequent maintenance of system data and the
structures utilized to display it.


                    TABLE 11

       Principal Reports of DATADICTIONARY

    INDEX        FIELD      DESCRIPTOR
    INDENTED     TEXT       RELATIONSHIPS
    DETAIL       ALIAS      DEFINITIONS


D.  ORACLE


ORACLE is a relational database management system
developed by Relational Software Incorporated of Menlo Park,
California. It was originally developed for use with
Digital Equipment Corporation PDP minicomputers and has been
converted to operate on IBM mainframes as well [Ref. 25].
Included in ORACLE is a dependent data dictionary that
performs a limited number of the functions discussed in
previous chapters.

Data is stored in ORACLE as relations, or two-dimensional
tables, which are organized into rows and columns. SQL
(Structured Query Language) is used for query, manipulation,
definition, and control of the ORACLE database. Information
about the contents of a table, its creator, authorized
users, calling programs, and associated views is kept in the
data dictionary and can be retrieved via SQL commands.

ORACLE's logical hierarchy of structures, as shown in
Figure 5.5, demonstrates the comparative simplicity of this
system. In this figure, a single arrowhead represents a
one-to-one relationship while the double arrowheads signify
one-to-many relationships. The database is divided into
logical partitions which can only be created or altered by
the database administrator. When users define tables, the
system allocates memory for one indexspace and one
dataspace. The indexspace is used by the database/dictionary
to store information about the table while the dataspace is
utilized for storing the actual information. As data is
entered into the database, the system automatically appends
extents (and pages) as necessary to support specific tables.

ORACLE's 18 data dictionary tables are described in
Figure 5.6. An example of one of the tables, CATALOG,
appears in Figure 5.7. Tables with the "SYS" prefix include


                  ORACLE DATABASE
                         |
                    PARTITION(s)
                         |
                     TABLE(s)
                    /        \
          INDEXSPACE          DATASPACE
              |                   |
          EXTENT(s)           EXTENT(s)
              |                   |
           PAGE(s)             PAGE(s)


Figure 5.5  ORACLE's Logical Hierarchy


information on system data in addition to the user's data.
For example, a display of SYSCATALOG might appear as Figure
5.8. In this particular example, there are 20 entries, 18
of which are system tables or views.

ORACLE's data dictionary is automatically updated
whenever any additions or deletions are made to the database
or when views are defined or user privileges are changed, so
it always has a current description of the database. As an
example, assume a new view, NAVYVIEW, is created using the
SQL CREATE command:

UFI> CREATE VIEW NAVYVIEW AS
  2  SELECT NAME, SSN, RANK


DTAB         - Description of tables & views in Oracle data
               dictionary
SYSCATALOG   - Profile of tables & views accessible to user
CATALOG      - Profile of tables accessible to user, excluding
               data dictionary
TAB          - List of tables, views, clusters, and synonyms
               created by user
SYSCOLUMNS   - Specifications of columns in accessible tables
               and views
COLUMNS      - Specifications of columns in tables (excluding
               data dictionary)
COL          - Specifications of columns in tables created
               by the user
SYSINDEXES   - List of indexes, underlying columns, creator,
               and options
INDEXES      - Indexes created by user & indexes on tables
               created by user
SPACES       - Selection of space definitions for creating
               tables & clusters
VIEWS        - Quotations of the SQL statements upon which
               views are based
SYSTABAUTH   - Directory of access authorization granted by
               or to the user
EXTENTS      - Data structure of extents within tables
STORAGE      - Data and index storage allocations for user's
               own tables
SYSSTORAGE   - Summary of all database storage -- for DBA
               use only
SYSUSERAUTH  - Master list of Oracle users -- for DBA use only
SYSEXTENTS   - Data structure of tables throughout system
               -- for DBA use only
PARTITIONS   - File structure of files within partitions
               -- for DBA use only


Figure 5.6  Tables of the ORACLE Data Dictionary


  3  FROM STUDENTS
  4  WHERE SERVICE = "USN"

View created.




NAME        CREATOR   TABTYPE   TABID

STUDENTS    LANDIN    TABLE     228609
ARMYVIEW    OWENS     VIEW      268800

Figure 5.7  ORACLE CATALOG Listing

NAME         CREATOR   TABTYPE   TABID

HELP         SYSTEM    TABLE     9935
DUAL         SYSTEM    TABLE     10457
STORAGE      SYSTEM    VIEW      11520
EXTENTS      SYSTEM    VIEW      11776
SPACES       SYSTEM    VIEW      12288
SYSCOLUMNS   SYSTEM    VIEW      12544
COLUMNS      SYSTEM    VIEW      12900
SYSCATALOG   SYSTEM    VIEW      13056
CATALOG      SYSTEM    VIEW      13312
SYSINDEXES   SYSTEM    VIEW      13563
INDEXES      SYSTEM    VIEW      13824
VIEWS        SYSTEM    VIEW      14030
SYSTABAUTH   SYSTEM    VIEW      14336
TAB          SYSTEM    VIEW      14643
COL          SYSTEM    VIEW      15104
EXPTAB       SYSTEM    VIEW      15360
EXPVEW       SYSTEM    VIEW      15616
DTAB         SYSTEM    TABLE     15373
STUDENTS     LANDIN    TABLE     228609
ARMYVIEW     OWENS     VIEW      268800

Figure 5.8  ORACLE SYSCATALOG Listing


Upon completion of this dialog, all ORACLE data dictionary
files will have been automatically updated to include the
new view. The CATALOG table would now appear as shown in
Figure 5.9.
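
The automatic catalog update can be tried on any relational system that keeps its schema in queryable tables. The Python sketch below uses SQLite purely as a stand-in for ORACLE (an assumption; the two systems differ in many details), with SQLite's built-in sqlite_master table playing the role of CATALOG. The column list of STUDENTS and the definition of ARMYVIEW are invented for the example.

```python
import sqlite3


def catalog(conn):
    """Return (NAME, TABTYPE) pairs from SQLite's built-in schema table."""
    rows = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    )
    return [(name, kind.upper()) for name, kind in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical base table and pre-existing view.
conn.execute(
    "CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT, RANK TEXT, SERVICE TEXT)"
)
conn.execute(
    "CREATE VIEW ARMYVIEW AS "
    "SELECT NAME, SSN, RANK FROM STUDENTS WHERE SERVICE = 'USA'"
)
before = catalog(conn)

# Creating the view is the only action taken; the catalog itself is
# never edited by hand.
conn.execute(
    "CREATE VIEW NAVYVIEW AS "
    "SELECT NAME, SSN, RANK FROM STUDENTS WHERE SERVICE = 'USN'"
)
after = catalog(conn)

print(before)
print(after)
```

The second listing contains the new view even though no statement touched the catalog directly, which is the behavior the text attributes to a dependent data dictionary.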

ORACLE provides security by using its data dictionary to
control access within the database. The database
administrator (DBA) provides the first level of access by
entering the user's name into the data dictionary's SYSUSERAUTH


NAME        CREATOR   TABTYPE   TABID

STUDENTS    LANDIN    TABLE     228609
ARMYVIEW    OWENS     VIEW      268800
NAVYVIEW    LANDIN    VIEW      288240


Figure 5.9  ORACLE CATALOG Listing With New View

table. Initial privileges, or subsequent changes to
authorized privileges, are issued using the GRANT or REVOKE
commands. ORACLE also supports multi-layered access: in
addition to privileges authorized by the DBA, a user can
grant various degrees of access privilege to others for
tables or views which he or she has created. A list of
current authorizations is maintained in the dictionary's
SYSTABAUTH view, as shown in Figure 5.10.
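
A rough Python model of this grant-and-revoke bookkeeping is given below. It is a simplified sketch, not ORACLE's mechanism: the AuthDirectory class and its methods are hypothetical, DBA-level privileges are left out, and only the creator of an object is allowed to grant access to it.

```python
class AuthDirectory:
    """Toy stand-in for an authorization view such as SYSTABAUTH."""

    def __init__(self):
        self.creators = {}  # object name -> creator
        self.rows = set()   # (grantor, grantee, tname, authority)

    def register(self, creator, tname):
        """Record who created a table or view."""
        self.creators[tname] = creator

    def grant(self, grantor, grantee, tname, authority):
        """Simplification: only the creator may grant on an object."""
        if self.creators.get(tname) != grantor:
            raise PermissionError(f"{grantor} did not create {tname}")
        self.rows.add((grantor, grantee, tname, authority))

    def revoke(self, grantor, grantee, tname, authority):
        """Remove a previously granted authority, if present."""
        self.rows.discard((grantor, grantee, tname, authority))

    def allowed(self, user, tname, authority):
        """Creators hold every authority; others need a matching row."""
        if self.creators.get(tname) == user:
            return True
        return any(
            grantee in (user, "PUBLIC") and name == tname and auth == authority
            for _, grantee, name, auth in self.rows
        )


auth = AuthDirectory()
auth.register("LANDIN", "STUDENTS")
auth.grant("LANDIN", "OWENS", "STUDENTS", "SELECT")
print(auth.allowed("OWENS", "STUDENTS", "SELECT"))
print(auth.allowed("OWENS", "STUDENTS", "INSERT"))
```

Because every decision is answered from the stored rows, listing those rows for one user yields a report of the same general shape as the SYSTABAUTH listing.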

ORACLE is a strong performer in the data integrity
category. Since the data dictionary is an integral part of
the database system, data is only maintained at one location
within the database. This prevents two users from acquiring
data from the database and getting different results. If
data were duplicated within the system, it would be possible
for one location to be updated while the other was not.
Figures 5.7 through 5.10 show that the ORACLE user will deal
mostly with subsets of the database, or subschemas.

ORACLE's documentation is limited to the information
that can be found in the data dictionary tables. It does
not provide information about which users use which data,
how often data is used, or when it is used. ORACLE does
support maintainability through automatic update of its
tables and through the concept of data independence. This
concept implies a separation of data definitions from the
programs or queries that might access the data in the




GRANTOR   CREATOR   GRANTEE   TNAME        TABTYPE   AUTHORITY

SYSTEM    SYSTEM    PUBLIC    HELP         TABLE     SELECT
SYSTEM    SYSTEM    PUBLIC    DUAL         TABLE     SELECT
SYSTEM    SYSTEM    PUBLIC    SYSCOLUMNS   VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    COLUMNS      VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    SYSCATALOG   VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    CATALOG      VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    SYSINDEXES   VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    SYSTABAUTH   VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    TAB          VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    EXPTAB       VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    EXPVIEW      VIEW      SELECT
SYSTEM    SYSTEM    PUBLIC    DTAB         VIEW      SELECT
LANDIN    LANDIN    OWENS     STUDENTS     TABLE     SELECT
LANDIN    LANDIN    OWENS     STUDENTS     TABLE     DELETE
LANDIN    LANDIN    OWENS     STUDENTS     TABLE     UPDATE
OWENS     OWENS     OWENS     ARMYVIEW     VIEW      DROP
OWENS     OWENS     OWENS     ARMYVIEW     VIEW      SELECT
LANDIN    LANDIN    OWENS     NAVYVIEW     VIEW      SELECT

Figure 5.10  ORACLE SYSTABAUTH Listing For User Owens




database. This allows the structures or definitions of the
data constructs to be modified without necessitating changes
in the programs or queries that access the database. If a
table is extensively modified, a view can be created to
interface with current programs. ORACLE's data integrity
will maintain the currency of the view by automatically
updating the view whenever applicable portions of the
governing table are modified.

ORACLE does provide the basic functions of definition,
update, retrieval, and software interface. However, like
other relational database management systems with dependent
data dictionaries, it does not offer the range of functions
of the other data dictionaries discussed in this chapter,
nor does it satisfactorily accomplish the three main
objectives of data management discussed in Chapter IV.
ORACLE's data dictionary


provides little more than a method of defining the
schema. The relational database management system
'dictionary' arises because the system needs a way to store
the schema and it does this through the use of the same
tables (relations) as it uses for the main database.
[Ref. 26].


ORACLE  could,  however,  serve  as  a  good  starting  point  for 
further  development. 


The modern relational DBMS does provide a very good
basis for a good dictionary system. This is because the
normal relational DBMS is equipped with two features
that help in making the implementation easy:

1.  Many relational DBMS now have a "triggering" feature
    that causes a procedure to be invoked on some data
    condition or event. Such a feature is needed to tie a
    DBMS to a dictionary system.

2.  The availability of the schema tables substantially
    reduces the effort in implementing the dictionary
    system. [Ref. 27]


The most important shortcoming of ORACLE's data dictionary
is its lack of documentation, without which it is difficult
to manage all aspects of an organization's data. If this
objective were incorporated into the system, ORACLE would be
a much more valuable tool.

E.  COMPARISON OF DATA DESIGNER, DATAMANAGER,
    DATADICTIONARY, AND ORACLE

Now that four representative samples of commercial data
dictionaries have been evaluated, we will compare the
primary features of each and identify which one(s) have come
closest to providing the features of our ideal system. For
ease of comparison, we have grouped all of the features,
functions, and guidelines that have been identified into the
six evaluation criteria categories: system standard schema
and extensibility, command and query languages, ease of use
(including menus), security, documentation and reports, and
application interfaces.

As the data dictionaries are evaluated in each of the
six categories, a brief chart will be used to compare each
dictionary against the FIPS standards. Each chart will
compare five data dictionaries:

FIPS = The ideal/FIPS data dictionary
MSP  = MSP DATAMANAGER
ADR  = ADR DATADICTIONARY
DDE  = DATA DESIGNER
ORA  = ORACLE DBMS/DD

A very subjective scoring system will be used, with grades
ranging from three to zero. The ideal/FIPS standard will
automatically receive a grade of "3" in each area,
representing the ideal combination of features. The meaning
of each grade is as follows:

"3" = Very strong performance by DD; no criticism

"2" = Good performance by DD; one or more significant
      shortcomings

"1" = (1) DD supports functional area very poorly; or
      (2) DD does not support functional area, but
      another component of the system does

"0" = DD (and remainder of system) fails to support
      this function
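
The arithmetic behind each comparison chart is a simple column sum of grades. The Python sketch below shows it for one category, using the Category Six (application interfaces) grades from Table 17; the dictionary layout and the subtotals function are assumptions introduced for illustration.

```python
# Grades for Category Six (application interfaces), copied from Table 17.
GRADES = {
    "DBMS Interface(s)":   {"FIPS": 3, "MSP": 3, "ADR": 2, "DDE": 1, "ORA": 1},
    "Language Interfaces": {"FIPS": 3, "MSP": 3, "ADR": 3, "DDE": 1, "ORA": 1},
}


def subtotals(grades):
    """Sum the grades of every functional area for each dictionary."""
    totals = {}
    for area, row in grades.items():
        for system, grade in row.items():
            # Grades are defined only on the scale 0 through 3.
            if not 0 <= grade <= 3:
                raise ValueError(f"grade out of range for {system} in {area}")
            totals[system] = totals.get(system, 0) + grade
    return totals


print(subtotals(GRADES))
```

Summing the six category subtotals in the same way gives the overall comparison totals.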


First, the data dictionary should provide a system
standard schema and the capability to add new entities,
relationships, and attributes to it. As shown in Table 12,
while DATADICTIONARY and DATAMANAGER closely resemble the
ideal system proposed by the FIPS, DATA DESIGNER and, in
particular, ORACLE fail to provide these capabilities.
DATAMANAGER supports three "add-on" collections of schema
descriptors. When added to the standard schema, each will
increase DATAMANAGER's capabilities to support a specific
application, e.g., programming.


                       TABLE 12

        Category One: Schemas and Extensibility


Functional Category     FIPS   MSP   ADR     DDE   ORA

System Stand. Schema      3     3     3       1     0
  Entity-types                       (20)    (2)
  Relationship-types                 (13)
  Attribute-types                    (50*)

DA/User Extensible        3     3     3       0     0

Category Subtotals        6     6     6       1     0

Second, the data dictionary should provide a command
language that will support queries from users while
reserving some capabilities solely for the use of the
dictionary administrator. This last ingredient supports
security and data integrity. Again, as seen in Table 13,
DATADICTIONARY and DATAMANAGER provide all capabilities of
the FIPS standard while the other two lag behind.


                       TABLE 13

         Category Two: Command/Query Languages


Functional Category     FIPS   MSP   ADR   DDE   ORA

CMD Interface Lang.       3     3     3     3     2

Query Commands            3     3     3     1     1

DA-Only Commands          3     3     3     0     2

Category Subtotals        9     9     9     4     5



Third, the ideal data dictionary must be relatively easy
to use, yet still powerful enough to support the experienced
user. One of the major ingredients of user-friendliness is
a menu-driven (or panel-driven) format. Good,
easy-to-understand examples are another important aid to the
new user. Table 14 reveals that, in our opinion, none of the
four systems can be considered easy to use. Looking at the
four as a group, two fail to use menus, one provides
examples which are complex and hard to understand, and the
fourth fails to provide either menus or good examples.

Fourth, security is one of the primary objectives of a
data dictionary. It should not only be able to control
general access to the system, but should also support the
capability to provide different levels of access to
different users. In Table 15, three of the four,
DATADICTIONARY, DATAMANAGER, and ORACLE, receive high marks
for providing both aspects of security. Security for
information contained within DATA DESIGNER must be provided
by the parent DBMS.

Fifth, the clarity and logical layout of system
documentation should be considered. Additionally, the reports




                       TABLE 14

         Category Three: Relative Ease of Use


Functional Category     FIPS   MSP   ADR   DDE   ORA

Menu-Driven               3     0     3     3     0

New User Friendly         3     1     2     1     2

Good Setup Example        3     1     2     1     3
  in Documentation

Category Subtotals        9     2     7     5     5

                       TABLE 15

               Category Four: Security


Functional Category     FIPS   MSP   ADR   DDE   ORA

Access Control            3     3     3
  (Password)

Degrees of Access         3     3     3
  (Levels)

DA-only Privileges        3     3     3

Category Subtotals        9     9     9     4     8


and the documentation prepared by the data dictionary must
be evaluated for usability. As indicated in Table 16, each
of the four data dictionaries approaches our ideal FIPS
standard in this category. It is interesting to note that
the two frontrunners, DATADICTIONARY and DATAMANAGER, have
some problems with documentation complexity.




                       TABLE 16

       Category Five: Documentation and Reports


Functional Category     FIPS   MSP   ADR   DDE   ORA

SYS Documentation         3     2     2     3     3
  clear/laid out well

Good Examples of          3     2     3     3     3
  Report Types

Reports Readable          3     3     3     3     3

Category Subtotals        9     7     8     9     9

Finally, the ideal data dictionary should support a
variety of applications, interfacing with both DBMS and
programming languages. As Table 17 shows, DATADICTIONARY and
DATAMANAGER both provide interfaces to one or more DBMS and
to two or more programming languages. While DATA DESIGNER
and ORACLE only interact with their system DBMS, DATAMANAGER
provides flexibility and versatility by supporting several
popular DBMS.


                       TABLE 17

         Category Six: Application Interfaces


Functional Category     FIPS   MSP   ADR   DDE   ORA

DBMS Interface(s)         3     3     2     1     1

Language Interfaces       3     3     3     1     1

Category Subtotals        6     6     5     2     2

When total "scores" are calculated, the results are as
shown in Table 18. While none of the systems provides all
of the characteristics of the ideal/FIPS system, ADR
DATADICTIONARY and MSP DATAMANAGER come the closest.


                       TABLE 18

          Data Dictionary Comparison Totals


Functional Category     FIPS   MSP   ADR   DDE   ORA

Schemas/Extensible        6     6     6     1     0

Command/Query Lang.       9     9     9     4     5

Ease-of-Use               9     2     7     5     5

Security                  9     9     9     4     8

Documentation/Rpts        9     7     8     9     9

Application Inter.        6     6     5     2     2

Comparison Totals        48    39    44    25    29


If an organization were starting "fresh", with no previous
investment in software, the ADR family of products, RIME,
warrants serious consideration. If, on the other hand, the
organization already has one of the popular DBMS, and is
simply seeking to add a new, or better, data dictionary, the
free-standing DATAMANAGER might very well satisfy the need.
In each of these two excellent commercial packages, the
observed shortcomings lie in the areas of user friendliness
and clear examples for new users. Although these are
important requirements, the faults will be overcome as the
users gain experience.

In the case of the other two dictionaries, their
shortcomings would be far harder to forgive. Their problems
lie in areas of standard schemas, extensibility, security,
etc. Each seems more user-friendly, but, since they do less,
there are fewer procedures to be explained. DATA DESIGNER,
although an interesting package, simply does not provide
several of the primary characteristics that we expect to
find in an ideal data dictionary. ORACLE is certainly the
weakest of the four dictionaries we evaluated. As part of
the ORACLE DBMS, this system does provide some data
dictionary features. However, it is not the full-featured
data dictionary we would recommend.


VI.  EXPANSIONS OF THE ROLE OF DATA DICTIONARIES


In this chapter we will suggest ways in which the role
of the data dictionary can be expanded beyond the basic uses
discussed in previous chapters. We will look first at how
the data dictionary can enforce standards in today's
increasingly common distributed data processing environment.
Then we will show how the process of decision making can be
supported through the use of a data dictionary. In
conclusion, we will attempt to foresee where data dictionary
technology will lead information resource management in the
years to come.

A.  DISTRIBUTED  DATA  PROCESSING 

Our discussion of databases up to this point has
centered around the assumption that an organization has one
centralized database, with centralized database management
and control, that would be accessed by all users. However,
many organizations have decided to distribute computing
power to various departments and/or outlying sites,
depending on the organization's structure. In such a
situation, it is also likely that the organization's
database will have to be distributed. A distributed database
is "a consistent, logically interrelated collection of data
stored at dispersed locations" [Ref. 28]. These dispersed
locations, called nodes, are connected by means of a network
which allows the nodes to communicate.

Many factors have contributed to the increasing
popularity of distributed processing. Two of the most
important are the following: [Ref. 29]

1.  Numerous advances in technology that have provided
    more powerful processing hardware at lower cost and
    improved communication and network capabilities.

2.  The need for faster and easier access to
    time-critical information to assist in the decision
    making of organizations with geographically
    dispersed components requiring unified information
    sharing and processing. (This concept will be
    discussed in detail in the next section.)

For organizations that employ a centralized approach to
control widely-dispersed, autonomous divisions, an attempt
to adhere to the traditional concepts of centralized
information resources may be ineffective and self-defeating.
These organizations might be tempted to sacrifice the
ability to better satisfy user needs in order to preserve
control and traditional relationships. Fortunately,
managers are rapidly becoming aware of the many potential
advantages of distributing some, or all, of the
organization's data processing functions to the user level.
Technological advances continue to encourage these changes
because

The availability of major computing resources in small,
low-cost packages allows the dedication and distribution
of needed capabilities, either standing alone or
interconnected, when and where they are needed. Many of the
complexities of centralized large-scale computing
facilities are no longer necessary. [Ref. 30]

It is important to remember, however, that

the  complexities  of  integrated  systems  require  digital 
data  communications,  appropriate  software,  and  extensive 
planning  and  coordination.  These  complexities  should 
not  be  underestimated.  [Ref.  30] 

One very successful corporation, Hewlett-Packard,
utilizes a combination of centralized, decentralized, and
distributed systems to support a variety of needs within the
organization. Corporate planning, employee benefits, and
establishment of standards are performed on mainframes
located at central management. Daily operations, data
processing, and employee pay and records have been
decentralized and are independently performed by each
division. Other functions, e.g., customer sales and support,
have been distributed to increase responsiveness and
timeliness.


Successful systems put the control of the data close to
the source of the information and the control of
processing close to the manager responsible for the
function being performed. In an organization like
Hewlett-Packard, this will frequently, but not always,
imply distributing the processing. Distributed
processing has made it possible for us to adapt to a
constantly expanding geographic operation, and a
constantly changing organizational structure, while
maintaining consistent administrative support.
[Ref. 31]


Another class of organization includes those that have
become so large and dispersed that they simply cannot be
supported effectively by totally centralized resources. The
armed services are prime examples of this type. For
example,


In an organization as large and decentralized as the
Navy, it would be impossible and inappropriate to impose
centralized control over the thousands of individual
small system applications that are clearly being put to
productive use. In fact, their main strength is their
ability to solve many of the information-handling
problems of users at the local level, without the need for
centralized software development and procurement delays.
[Ref. 32]


In the years ahead, a growing awareness of these conditions
will drive an ever-increasing number of military
organizations to distribute some portion of their
information resource needs.

Data dictionaries that are designed for operations
within distributed environments will require all of the
capabilities of those operating solely in a centralized
environment. However, a distributed data dictionary must
support three specialized functions in addition to basic
data dictionary functions:

1.  the ability to locate data within the network

2.  the coordination/management of distributed data

3.  the ability to perform data transformation in
    support of user applications

The  distributed  data  dictionary's  directory  function  enables 
it  to  identify  which  network  node  contains  the  specific 
information  that  is  needed.  Whether  the  particular  database 
is  distributed  by  replication  or  partitioning,  the  data 
dictionary  must  provide  information  about  its  logical  and 
physical  characteristics. 
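
A minimal Python sketch of such a directory function follows. It is an illustration only: the Placement and Directory classes, the node names, and the data items are hypothetical and are not taken from any product discussed in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    node: str      # network node holding the data
    fragment: str  # "full" for a replica, otherwise a partition label


class Directory:
    """Maps each data item to the node(s) where it is stored."""

    def __init__(self):
        self._entries = {}

    def record(self, item, node, fragment="full"):
        """Note that a node holds a full copy or one fragment of an item."""
        self._entries.setdefault(item, []).append(Placement(node, fragment))

    def locate(self, item):
        """Return the nodes that hold any part of the item."""
        return [p.node for p in self._entries.get(item, [])]

    def is_replicated(self, item):
        """True when several nodes each hold a complete copy."""
        placements = self._entries.get(item, [])
        return len(placements) > 1 and all(
            p.fragment == "full" for p in placements
        )


directory = Directory()
directory.record("PERSONNEL", "MONTEREY")  # replicated at two nodes
directory.record("PERSONNEL", "NORFOLK")
directory.record("SUPPLY", "MONTEREY", fragment="west-coast")  # partitioned
directory.record("SUPPLY", "NORFOLK", fragment="east-coast")
print(directory.locate("SUPPLY"))
```

Recording the fragment label alongside the node lets the same structure describe both replicated and partitioned data, which the dictionary must be able to distinguish.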

In the case of replicated data, where functionally
identical copies of the data are stored at multiple nodes in
the network, the distributed DD/DS [data dictionary]
must have knowledge of the known redundancies throughout
the network. Synchronization of updates in this case is
critical. [Ref. 33]

In a partitioned database, where only certain portions of
the database are located at individual nodes, the data
dictionary's role becomes even more important because "it
must know the relationships among the pieces, and be able to
manage all the parts, such that this physical dispersion of
the data is transparent to the user" [Ref. 34]. Finally,
the distributed data dictionary may be required to perform
transformation of data to support various users. If serving
a heterogeneous network--one in which dissimilar types of
hardware and software coexist--the data dictionary will have
to translate between different data and storage structures.

The distributed DD/DS [data dictionary] can facilitate
these translation processes by providing the metadata
mappings to allow the source to be transformed into the
target data. This is accomplished by storing in the
data dictionary the source and target metadata
descriptions to be used by the mapping process. [Ref. 35]
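
The mapping process described in this passage can be sketched in Python as follows. The field names, the metadata layout, and the build_transform function are all assumptions made for the example; a real dictionary would hold far richer descriptions (lengths, encodings, storage structures).

```python
# Hypothetical source and target metadata descriptions, as they might
# be stored in the data dictionary.
SOURCE_META = {"EMPNO": {"type": "str"}, "SALARY": {"type": "str"}}
TARGET_META = {"employee_id": {"type": "int"}, "salary": {"type": "float"}}

# Hypothetical mapping from source field names to target field names.
FIELD_MAP = {"EMPNO": "employee_id", "SALARY": "salary"}

CONVERTERS = {"int": int, "float": float, "str": str}


def build_transform(source_meta, target_meta, field_map):
    """Build a record translator from the stored metadata."""
    # Refuse mappings that refer to fields the dictionary does not know.
    for src, tgt in field_map.items():
        if src not in source_meta or tgt not in target_meta:
            raise KeyError(f"no metadata for mapping {src} -> {tgt}")

    def transform(record):
        # Rename each field and convert it to the target's declared type.
        return {
            tgt: CONVERTERS[target_meta[tgt]["type"]](record[src])
            for src, tgt in field_map.items()
        }

    return transform


to_target = build_transform(SOURCE_META, TARGET_META, FIELD_MAP)
print(to_target({"EMPNO": "000123", "SALARY": "1500.50"}))
```

The translator contains no knowledge of any particular record format; everything it needs comes from the stored descriptions, so a change of format requires only a change of metadata.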


The distribution of dictionary capabilities can be
accomplished by several alternative configurations. One
possible configuration, as mentioned earlier, involves
duplicating the data dictionary in its entirety at each node
of the network. An example of this is shown as Figure 6.1.
(Dashed lines indicate node-to-node communications and
dotted lines indicate dictionary-to-dictionary


[Diagram: interconnected Network Nodes, each holding a
complete DATA DICTIONARY]


Figure 6.1  Duplicated Data Dictionaries

communications.) Each data dictionary will contain a
complete copy of the entire organization's metadata. While
the nodes themselves will interact frequently, the various
copies of the dictionary will not. However, when one copy
of the dictionary is updated, all other copies must be
automatically updated if data integrity is to be maintained.
This duplication of metadata will result in some degree of
additional overhead, but it will improve the responsiveness
of the system and minimize the necessity of inter-data
dictionary queries. In some implementations, communication
costs can be significantly reduced. This configuration will
be most desirable in cases in which the organization's
database(s) are also duplicated at each node or if nodes would




be likely to access each other's metadata often. A stable
organization with well-established data processing, where
metadata is not continuously being updated, would benefit
most from this configuration.
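
The trade-off of the duplicated configuration (local lookups, but every update sent to all other copies) can be made concrete with a small Python sketch. The ReplicatedDictionary class and its message counter are hypothetical and model only the bookkeeping, not real network traffic.

```python
class ReplicatedDictionary:
    """Every node holds a complete copy of the metadata."""

    def __init__(self, node_names):
        self.copies = {name: {} for name in node_names}
        self.messages_sent = 0  # crude measure of update overhead

    def update(self, origin, entity, description):
        """Apply an update at its origin and propagate it to all copies."""
        for name, copy in self.copies.items():
            copy[entity] = description
            if name != origin:
                self.messages_sent += 1  # one message per remote copy

    def lookup(self, node, entity):
        """Answer from the local copy; no other node is consulted."""
        return self.copies[node].get(entity)


dd = ReplicatedDictionary(["A", "B", "C"])
dd.update("A", "STUDENTS", "table of enrolled students")
print(dd.lookup("C", "STUDENTS"), dd.messages_sent)
```

With n nodes each update costs n - 1 messages while lookups cost none, which is why this configuration suits organizations whose metadata seldom changes.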

In the second configuration, the data dictionary is
partitioned among the various network nodes. As shown in
Figure 6.2, each node contains only that portion of the
dictionary that contains the metadata it requires. No one


[Diagram: interconnected Network Nodes, each holding its
own DD Partition]


Figure 6.2  Partitioned Data Dictionary (DD)


node or station within the system will have a complete data
dictionary. This configuration would be used when there is
not much need for the nodes of the network to access each
other's metadata and there is a relatively clear-cut
differentiation between the functions being carried on at
each node, which implies different metadata. Because
redundancy is kept to an absolute minimum, problems could
arise if a node's data dictionary partition were lost unless
good backup procedures were in effect. Since each node is
only responsible for maintaining its own portion of the
whole, there is little update overhead and thus little
system delay as long as the required metadata exists at that
particular node.




In the final configuration, the data dictionary is
distributed in a hierarchical structure. There will be one
"master" copy of the dictionary and one or more partial
copies throughout the network, as shown in Figure 6.3.

Figure 6.3 Hierarchy of Distributed Data Dictionaries
[Diagram: one network node holding the master DATA DICTIONARY, above linked network nodes that each hold a DD partition.]

In this configuration, each node that contains a portion of the
data dictionary is responsible for updating the master
dictionary whenever its portion is modified. This structure
ensures data integrity and provides flexibility by allowing
varying amounts of metadata to be distributed. Another use
for this hierarchical structure might be to separate
functionality within a network, e.g., database, automated
office, and programming functions. Each of these functions
is able to maintain its portion of the dictionary locally
while one master copy is available to handle inter-partition
queries.
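
The division of labor between the partitions and the master copy can be sketched as follows. This is a minimal illustration under stated assumptions: the class, its methods, and the "local"/"master" labels are hypothetical, and a real system would also need communication and failure handling that the sketch omits.

```python
class HierarchicalDictionary:
    """Hypothetical master copy plus per-node partitions."""

    def __init__(self):
        self.master = {}       # entity -> (owning node, metadata)
        self.partitions = {}   # node -> {entity: metadata}

    def update(self, node, entity, metadata):
        # The node changes its own partition, then reports to the master.
        self.partitions.setdefault(node, {})[entity] = dict(metadata)
        self.master[entity] = (node, dict(metadata))

    def lookup(self, node, entity):
        # Local hit: answered from the node's own partition.
        local = self.partitions.get(node, {})
        if entity in local:
            return local[entity], "local"
        # Miss: the master copy handles the inter-partition query.
        if entity in self.master:
            return self.master[entity][1], "master"
        return None, "not found"
```

Each update costs two writes regardless of network size, and only lookups that miss the local partition travel to the master.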

There are presently several commercial packages in the
development or testing stages that will be able to satisfy
the requirements of distributed processing. One system that
is already available and being used in numerous applications
is ADR's Relational Information Management Environment
(RIME) system. As discussed in Chapter V, this system
features fourteen separate components that can be integrated
into one "total" system. One component, D-NET, combines a
database, data dictionary, and communications interfaces to
support the special requirements of distributed processing.
D-NET is capable of supporting both homogeneous and
heterogeneous networks:


The flexibility provided by D-NET and the other software
components allows users to configure the distributed
system networks based on the needs of each node.
Various operating systems, computer types, and cooperating
software products can be used to create a specific
environment without impacting application development
and operations. [Ref. 36]


D-NET can implement the system's data dictionary,
DATADICTIONARY, either as one centralized dictionary or as
multiple copies stored at remote locations. Similarly,
RIME's database, DATACOM/DB, can be maintained either at one
centralized location or distributed to various nodes
throughout the network. D-NET serves as the basis of the
Army's project VIABLE, providing numerous benefits that
include cost effectiveness, high expandability, increased
productivity, resource control and synchronization, and
independent operation at the local user's level.


B.  DECISION-MAKING

In this section we will show how the data dictionary
provides managers with the efficiently recorded, accurate,
and timely information necessary to make decisions in
consonance with the goals of the organization, whether in a
centralized or distributed environment. According to the
report of the Committee on Review of Navy Long-Range ADP
Planning, "information technology", which includes data
dictionaries, is


critical to the Navy's ability to fulfill its wartime
and peacetime roles in an optimal manner. The available
technologies would enable the Navy to approach its
missions with information and data that (1) have been
collected and recorded simply, (2) have improved
accuracy, (3) have been speedily reported, collated, and
distributed, (4) lead to summaries that are timely and
to the point, as and when needed, and (5) have enabled
both manpower commitments and costs to be reduced.
[Ref. 37]


1.  The Decision-Making Process

Herbert Simon's classic model of the decision-making
process, as cited by Sprague and Carlson [Ref. 38], consists
of three distinct steps: intelligence, design, and choice.
The use of a data dictionary supports the decision maker as
he takes each step.

a. Intelligence involves searching the environment
for conditions calling for decisions. Raw data must be
obtained, processed, and examined for clues that may
identify problems. However, so much data is available within an
organization that a seemingly infinite parade of information
can be produced--this situation is called information
overload. There must be some way of narrowing down the amount
of information that is presented to the decision maker. A
data dictionary used in conjunction with a database can play
an important role in this narrowing process. As discussed
earlier in the thesis, the dictionary helps an organization
identify and eliminate redundant data. Its query language
can be used to select information about a particular entity
and its report definition capability can be used to generate
aggregate, rather than detailed, data. Relationships between
entities are easily identified so that managers' questions
such as "What is the range of values for 'Readiness Status'
data?" and "Which departments receive the 'Ammunition
Transaction' report?" can be answered.


b. Design entails inventing, developing, and
analyzing possible courses of action. This involves
processes to understand the problem, to generate solutions,
and to test solutions for feasibility. The data dictionary
plays a key role in documenting the decision maker's
environment so that he or she will have a centralized source of
information from which to develop possible choices. The
dictionary can also be used to tailor information to meet
specific needs by defining user views of data and
restricting user access to certain data. In this way, users
can be presented only with the information they are supposed
to have and need to have, as determined by higher authority
in the organization, instead of having to deal with
nonessential information.

In addition to recording information about the
plans, structure, and functions of the organization, the
data dictionary can also be used to record information about
the decision makers themselves. In the case of the U.S.S.
Constellation, for example, information about the commanding
officer and the key elements of his environment can be
documented: which decisions he wishes to make and which ones
his subordinates will make, the mission assigned to the
carrier by the C.O.'s superiors, the relative priorities he
attaches to various subjects, his short term and long term
personal goals, previous decisions he has made, and so on.

c. Choice involves selecting a particular course of
action from those available and implementing that choice.
Of course, the ultimate decision will lie with the decision
maker, and not with the data dictionary. At best, the data
dictionary can present options to the decision maker and,
once the choice is made, can document the steps taken to
implement that choice.
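
The dictionary queries described under Intelligence and the user views described under Design can be sketched in a few lines of Python. The metadata layout, the role names, and the function names are illustrative assumptions; they do not reproduce the query language of any dictionary product.

```python
# Hypothetical metadata; entity names echo the examples in the text, but
# every value and role below is an illustrative assumption.
METADATA = {
    "Readiness Status": {"kind": "element", "allowed_values": [1, 2, 3, 4]},
    "Ammunition Transaction": {"kind": "report",
                               "recipients": ["Weapons", "Supply"]},
    "Payroll Summary": {"kind": "report", "recipients": ["Disbursing"]},
}

# Design step: views restrict each role to the entities it needs.
USER_VIEWS = {
    "operations_officer": {"Readiness Status", "Ammunition Transaction"},
    "disbursing_clerk": {"Payroll Summary"},
}


def value_range(metadata, element):
    """What is the range of values for this data element?"""
    values = metadata[element]["allowed_values"]
    return min(values), max(values)


def report_recipients(metadata, report):
    """Which departments receive this report?"""
    return sorted(metadata[report]["recipients"])


def view_for(metadata, views, role):
    """Return only the entries the role's view allows; unknown roles see none."""
    allowed = views.get(role, set())
    return {name: entry for name, entry in metadata.items() if name in allowed}
```

Because the answers come from metadata rather than from the operational data itself, a manager can learn what exists and who uses it without retrieving a single detailed record.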


2.  Crisis  Management 


The accuracy and timeliness of information provided
to the decision maker become critically important when
the decision-making process occurs during a crisis
situation. In wartime, for example, there is usually a great
deal of risk associated with a decision: many decision
makers are involved, information must be consolidated from a
variety of sources and locations, little time is available
to make decisions, and, due to the uniqueness of events,
there is often no pre-defined structure for making the
decision. There are four ways that the data dictionary can
prove especially helpful in crisis decision-making.

a. The dictionary speeds up the information-gathering
process. As discussed earlier, user views and
accesses have been pre-defined and can be changed easily as
needed. Active data dictionaries provide for automatic
update of any changes that are made, so information is
always current.

b. The dictionary prioritizes information. The
priorities of the organization and the decision makers are
taken into account and can be updated as events occur. In
this way, the attention of decision makers is focused on
truly important information rather than dispersed over a
wide range of information.

c. The dictionary provides a common information
base. This is important when many decision makers at
different locations are involved. All participants have the
latest information and can also take advantage of the
"corporate memory" provided by the dictionary.

d. In short, the dictionary provides "intelligent"
information management. It reduces information overload,
tailors information to specific decision makers' needs, and
responds well to infrequent, ad hoc requests. It helps to
establish relationships between events as they occur.



The typical, or even "ideal", data dictionary will
not be able to fully support the decision-making process
without the help of additional sophisticated software to
take advantage of its capabilities. We believe that as the
acceptance and use of the data dictionary as a tool for
information resource management become widespread, the
demand for an expanded role for the dictionary will
increase. Organizations must become more accomplished in
the top-down planning process of the system development life
cycle in order to receive maximum benefits from data
dictionary technology.

C.  CONCLUSIONS 

In this thesis, we have discussed the structure,
functions, and objectives of a data dictionary. We have
compared popular commercial products to an "ideal"
dictionary based on criteria we developed and on FIPS DDS
guidelines. We have analyzed the role of a data dictionary
in information resource management, including its support of
a distributed data processing environment and of the
decision-making process. It seems clear that as
organizations become cognizant of the need to manage their
information efficiently, the importance and necessity of data
dictionary implementation will continue to increase.

Designers  of  data  dictionaries  are  aware  of  these  trends 
and  are  moving  in  the  following  directions: 

First, toward what is known as an integrated data
dictionary and second, toward a free-standing dictionary
that serves as a driver of a distributed data processing
system made up of several types of computers, data base
management systems, file managers, and text editors.
[Ref. 39]

In reference to the first projection, several commercial
systems have been developed that feature integration of a
data dictionary with a database. One example of this is
ADR's RIME, which features integration of a database and a
data dictionary with numerous other components to form one
very capable and flexible system. Addressing the second
projection, Rullo [Ref. 40] foresees development of a "super
data dictionary" to support future integrated and
distributed systems:


In this environment, the data dictionary would act as a
driver of the system. The data dictionary/data
directory might also have some integrated facilities
permitting transfer of data among other system software
functions including itself. There is a trend in this
direction, with other systems depending on the data
dictionary/data directory and that system itself
beginning to resemble a model of the enterprise.


We believe the future holds significant improvements and
expansions of data dictionary technology. It is important
that the development of standards for data dictionary
compatibility continue alongside the standards
currently being developed to support network
communications. It is conceivable that these standards, if
widely accepted, would allow any data dictionary to "talk"
to another and to exchange information. The FIPS DDS
standards developed by the National Bureau of Standards will
most likely become the basis for data dictionaries procured
and used by the federal government.

We also foresee the use of fourth generation languages,
the extremely user-friendly, "close-to-natural-language"
languages that will facilitate user access to the
dictionary's metadata. These languages will replace the formal
command languages and awkward syntax described earlier in
the thesis. Another factor contributing to the increased
utility of data dictionaries will be the use of
sophisticated software and artificial intelligence techniques in
conjunction with the dictionary. As the central source of
data about an organization, the data dictionary contains a
broad base of information upon which an artificial
intelligence "expert" system can be built. For example, it is
possible that an expert system would be able to verify and
validate additions to the dictionary schema based on
pre-determined rules and information gained from previous
manipulations of the schema. It would also be able to establish
associations between the contents of the data dictionary and
flag them for the attention of the decision maker. In
addition, a "smart" data dictionary would be able to "realize"
that every time a user logs on to the system, he asks for
particular information, so that eventually, the data
dictionary will provide it for him automatically.
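
The "smart" behavior described above could be approximated, under simple assumptions, by counting what each user requests per session. The class name, the session threshold, and the rule "requested in every session so far" are all hypothetical choices made for illustration; an actual expert system would use far richer rules.

```python
from collections import defaultdict


class SmartDictionary:
    """Hypothetical wrapper that learns what each user asks for at logon."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed cutoff, not from the thesis
        # user -> entity -> number of sessions in which it was requested
        self.counts = defaultdict(lambda: defaultdict(int))
        self.sessions = defaultdict(int)

    def record_session(self, user, requested_entities):
        self.sessions[user] += 1
        for entity in set(requested_entities):
            self.counts[user][entity] += 1

    def automatic_items(self, user):
        # Offer an item unprompted once it has been requested in every
        # session so far and the user has at least `threshold` sessions.
        total = self.sessions[user]
        if total < self.threshold:
            return []
        return sorted(e for e, n in self.counts[user].items() if n == total)
```

The threshold keeps the dictionary from volunteering information after only one or two logons, when a pattern cannot yet be distinguished from coincidence.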

No matter what changes occur in data dictionary
technology, the data dictionary's role in the efficient
management of an organization's information resource will continue
to be an increasingly important one. The dictionary will
support the organization in its planning and analysis of
functions, its development of information systems, the
maintenance of those systems, and the intelligent use of those
systems. We believe that the military will soon provide a
vast market for data dictionary software and that the
demands of its users will drive data dictionary technology
even further.


APPENDIX  A 
BACKUS-NAUR FORM


Backus-Naur form is a graphic notation for describing
the syntax of a language. It is used by the Federal
Information Processing Standard for Data Dictionary Systems
(FIPS DDS) to show the format of the commands used to
manipulate the dictionary. The following are common Backus-Naur
symbols used by the FIPS DDS:

< >  denotes a word or phrase

|    indicates a choice between two or more alternatives, "or"

[ ]  represents an option that the user may or may not include

{ }  is used to set off choices separated by "|" and
     to enclose the format of the command

The syntax for the ADD-ENTITY command appears as follows:

ADD-ENTITY
{[OF] {ENTITY-TYPE | E-T} <entity-type-name>
WHERE NAME [IS] <name-clause>
[WHERE {ATTRIBUTE | A} [FOR] <attribute-clause-1>
[<attribute-clause-n>]]
[WITH SECURITY <security-clause>]}

It indicates that there are several different ways of adding
an entity to the dictionary. At a minimum, the command must
include ENTITY-TYPE or E-T, an entity-type name, WHERE NAME,
and a name clause. The words OF and IS are optional, as are
the last two phrases set off by brackets. If the phrases
are used, the same rules hold for choosing elements within
them.
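
The minimum form of the command described above can be checked with a small hand-written recognizer. This is a sketch under stated assumptions: the entity-type name and the name clause are each treated as a single token, the two optional bracketed phrases are not handled, and the function name and the sample entity type ELEMENT are illustrative rather than taken from the FIPS DDS.

```python
def parse_add_entity(command):
    """Parse the mandatory part of ADD-ENTITY.

    Returns (entity_type_name, name_clause), or None if the command does
    not match. Clauses are assumed to be single tokens (a simplification).
    """
    tokens = command.split()
    pos = 0

    def accept(*words):
        # Consume the next token if it is one of the given keywords.
        nonlocal pos
        if pos < len(tokens) and tokens[pos].upper() in words:
            pos += 1
            return True
        return False

    if not accept("ADD-ENTITY"):
        return None
    accept("OF")                              # [OF] is optional
    if not accept("ENTITY-TYPE", "E-T"):      # {ENTITY-TYPE | E-T}
        return None
    if pos >= len(tokens):
        return None
    type_name = tokens[pos]                   # <entity-type-name>
    pos += 1
    if not (accept("WHERE") and accept("NAME")):
        return None
    accept("IS")                              # [IS] is optional
    if pos != len(tokens) - 1:
        return None                           # exactly one token must remain
    return type_name, tokens[pos]             # <name-clause>
```

Both the long form, "ADD-ENTITY OF ENTITY-TYPE ELEMENT WHERE NAME IS Readiness-Status", and the short form, "ADD-ENTITY E-T ELEMENT WHERE NAME Readiness-Status", are accepted, while a command that omits ENTITY-TYPE or E-T is rejected.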




LIST OF REFERENCES


Kroenke, David M., Database Processing, p. 1, Science
Research Associates, 1983

Committee on Review of Navy Long-Range ADP Planning,
"Navy Nontactical Automatic Data Processing Policy,
Organization and Management--Interim Report to the
United States Navy," p. 2, July 1983

Darrell, William, "Disorder to Discipline Via the Data
Dictionary," Journal of Systems Management, v. 34, p.
19, May 1983

Applied Data Research, Inc., Introduction to
DATADICTIONARY, 1982

Lefkovits, Henry C., Sibley, Edgar H., and Lefkovits,
Sandra L., Information Resource/Data Dictionary
Systems, p. 2-6, QED Information Sciences, 1983

National Bureau of Standards, Federal Information
Processing Standard for Data Dictionary Systems,
August 1983

Goldfine, Alan H., ed., Data Base Directions:
Information Resource Management--Strategies and Tools,
National Bureau of Standards, p. 17, September 1982

Goldfine, p. 18

Leong-Hong, Belkis W. and Plagman, Bernard K., Data
Dictionary/Directory Systems: Administration,
Implementation, and Usage, pp. 25-53, John Wiley &
Sons, 1982

Leong-Hong and Plagman, p. 32

Leong-Hong and Plagman, p. 45

Allen, Frank W., Loomis, Mary E. S., and Mannino,
Michael V., "The Integrated Dictionary/Directory
System," Association for Computing Machinery Computing
Surveys, v. 14, pp. 255-257, June 1982

Leong-Hong and Plagman, p. 224

Van Duyn, J. A., Developing a Data Dictionary System,
p. 39, 1982

Schussell, George, "The Role of the Data Dictionary,"
Datamation, v. 23, p. 133, June 1977

Kreitzer, Lawrence W., "Data Dictionaries--The Heart
of IRM," Infosystems, v. 28, p. 64, March 1981

Van Duyn, pp. 39-40

Leong-Hong and Plagman, p. 132

Database Design, Inc., Data Designer User Guide, pp.
3-4, 1983

MSP, Inc., DATAMANAGER User's Guide, pp. 1-2, 1983

MSP, Inc., pp. 2-6

Goetz, Martin, "The ADR Integrated System,"
Infosystems, v. 29, p. 4, September 1982

Lefkovits, Sibley, and Lefkovits, p. 6-64

Kroenke, p. 117

Lefkovits, Sibley, and Lefkovits, p. 1-43

Lefkovits, Sibley, and Lefkovits, p. 1-47

Allen, Loomis, and Mannino, p. 256

Leong-Hong and Plagman, p. 229

Committee on Review of Navy Long-Range ADP Planning,
p. 9

Van Rensselaer, Cort, "Centralize? Decentralize?
Distribute?", Datamation, v. 25, p. 57, April 1979

Committee on Review of Navy Long-Range ADP Planning,
p. 13

Leong-Hong and Plagman, p. 234

Leong-Hong and Plagman, p. 234

Leong-Hong and Plagman, p. 236

36. Applied Data Research, Inc., ADR/D-NET: Concepts and
Facilities, p. 3

37. Committee on Review of Navy Long-Range ADP Planning,
p. 3

38. Sprague, Ralph H., Jr., and Carlson, Eric D., Building
Effective Decision Support Systems, Prentice-Hall, 1982

39. Rullo, Thomas A., Advances in Data Base Management, p.
136, Heyden & Son, Inc.

40. Rullo, p. 139



INITIAL  DISTRIBUTION  LIST 




